2007 WikipediaMiningforanAssocWeb...


Subject Headings:

Notes

Cited By

Quotes

Abstract

Wikipedia has become a huge phenomenon on the WWW. As a corpus for knowledge extraction, it has various impressive characteristics such as a huge amount of articles, live updates, a dense link structure, brief link texts and URL identification for concepts. In this paper, we propose an efficient link mining method pfibf (Path Frequency - Inversed Backward link Frequency) and the extension method “forward / backward link weighting (FB weighting)” in order to construct a huge scale association thesaurus. We proved the effectiveness of our proposed methods compared with other conventional methods such as cooccurrence analysis and TF-IDF.

References

  • Giles, J.: Internet encyclopaedias go head to head. Nature 438 (2005) 900–901
  • Nakayama, K., Hara, T., Nishio, S.: A thesaurus construction method from large scale web dictionaries. In: Proceedings of IEEE International Conference on Advanced Information Networking and Applications (AINA 2007). (2007) 932–939
  • Ruiz-Casado, M., Alfonseca, E., Castells, P.: Automatic assignment of Wikipedia encyclopedic entries to WordNet synsets. In: Proceedings of Advances in Web Intelligence Third International Atlantic Web Intelligence Conference (AWIC 2005). (2005) 380–386
  • Strube, M., Ponzetto, S.: WikiRelate! Computing semantic relatedness using Wikipedia. In: Proceedings of National Conference on Artificial Intelligence (AAAI-06), Boston, Mass. (2006) 1419–1424
  • Milne, D., Medelyan, O., Witten, I.H.: Mining domain-specific thesauri from Wikipedia: A case study. In: Proceedings of ACM International Conference on Web Intelligence (WI’06). (2006) 442–448
  • Gabrilovich, E., Markovitch, S.: Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In: Proceedings of International Joint Conference on Artificial Intelligence (IJCAI 2007). (2007) 1606–1611
  • Salton, G., McGill, M.: Introduction to Modern Information Retrieval. McGraw-Hill Book Company (1984)
  • Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: Bringing order to the web. Technical Report, Stanford Digital Library Technologies Project (1999)
  • Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. Journal of the ACM (5) (1999) 604–632
  • Davison, B.D.: Topical locality in the web. Proceedings of the ACM SIGIR (2000) 272–279
  • Schutze, H., Pedersen, J.O.: A cooccurrence-based thesaurus and two applications to information retrieval. International Journal of Information Processing and Management 33(3) (1997) 307–318
  • Finkelstein, L., Gabrilovich, E., Matias, Y., Rivlin, E., Solan, Z., Wolfman, G., Ruppin, E.: Placing search in context: the concept revisited. ACM Trans. Inf. Syst. 20(1) (2002) 116–131
  • Chen, H., Yim, T., Fye, D.: Automatic thesaurus generation for an electronic community system. Journal of the American Society for Information Science 46(3) (1995) 175–19,


2007 WikipediaMiningforanAssocWeb...
  • Author: Kotaro Nakayama, Takahiro Hara, Shojiro Nishio
  • Title: Wikipedia Mining for an Association Web Thesaurus Construction
  • Journal: Web Information Systems Engineering
  • Title URL: http://wikipedia-lab.org/ja/images/9/90/Wise2007.pdf
  • Year: 2007