2000 RetrievingDescrPhrasesfromLargeAmountsofFreeText

From GM-RKB

Subject Headings: Information Retrieval, Descriptive Phrase, Large Corpora.

Notes

Cited By

Quotes

Abstract

This paper presents a system that retrieves descriptive phrases of proper nouns from free text. Sentences holding the specified noun are ranked using a technique based on pattern matching, word counting, and sentence location. No domain specific knowledge is used. Experiments show the system able to rank highly those sentences that contain phrases describing or defining the query noun. In contrast to existing methods, this system does not use parsing techniques but still achieves high levels of accuracy. From the results of a large-scale experiment, it is speculated that the success of this simpler method is due to the high quantities of free text being searched. Parallels between this work and recent findings in the very large corpus track of TREC are drawn.
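The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pattern templates, the weights, the way words are counted, and the names `PATTERN_TEMPLATES`, `score_sentence`, and `rank_sentences` are all assumptions made for illustration, since the abstract only names the three signals (pattern matching, word counting, sentence location).

```python
import re

# Hypothetical pattern templates ({q} is the query noun). These are
# illustrative guesses, not the patterns used in the paper.
PATTERN_TEMPLATES = [
    r"\bsuch as {q}\b",
    r"\b{q} and other\b",
    r"\b{q}, (?:a|an|the)\b",
    r"\b{q} (?:is|was) (?:a|an|the)\b",
]


def score_sentence(sentence, position, query,
                   w_pattern=2.0, w_count=1.0, w_position=1.0):
    """Combine three signals into one score (weights are assumed)."""
    q = re.escape(query)
    # Signal 1: how many templates match the sentence.
    pattern_hits = sum(
        1 for template in PATTERN_TEMPLATES
        if re.search(template.format(q=q), sentence, re.IGNORECASE)
    )
    # Signal 2: a simple word count, here occurrences of the query noun
    # (an assumption; the abstract does not say which words are counted).
    query_count = len(re.findall(r"\b" + q + r"\b", sentence, re.IGNORECASE))
    # Signal 3: sentences nearer the start of a document score higher
    # (assumed form of the location signal).
    location = 1.0 / (1 + position)
    return w_pattern * pattern_hits + w_count * query_count + w_position * location


def rank_sentences(documents, query):
    """documents: list of documents, each a list of sentence strings."""
    query_re = re.compile(r"\b" + re.escape(query) + r"\b", re.IGNORECASE)
    scored = []
    for doc in documents:
        for position, sentence in enumerate(doc):
            # Only sentences holding the query noun are candidates.
            if query_re.search(sentence):
                scored.append((score_sentence(sentence, position, query), sentence))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored]


docs = [
    ["Kyoto, a city in Japan, hosted the talks.", "Delegates left Kyoto on Friday."],
    ["The weather was mild.", "Cities such as Kyoto attract visitors."],
]
ranked = rank_sentences(docs, "Kyoto")
```

No parsing is involved: each candidate sentence is scored with regular expressions and simple counts only, which mirrors the abstract's contrast with parser-based methods.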

References

  • Caraballo, S.A., Charniak, E., "Determining the specificity of nouns from text", Proceedings of the joint SIGDAT conference on empirical methods in natural language processing (EMNLP) and very large corpora (VLC), 63-70, (1999).
  • Chinchor, N.A., "Overview of MUC-7/MET-2", Proceedings of the Message Understanding Conference Proceedings MUC-7, (1998).
  • Cooper, W.S., "Fact Retrieval and Deductive Question-Answering Information Retrieval Systems", Journal of the ACM, ACM Press, 11(2), 117-137, (1964).
  • Hawking, D., Thistlewaite, P., "Overview of TREC-6 Very Large Collection Track", NIST Special Publication 500-240: The Sixth Text REtrieval Conference (TREC 6), E.M. Voorhees, D.K. Harman (eds.), 93-106, (1997).
  • Hearst, M., "Automatic Acquisition of Hyponyms from Large Text Corpora", Proceedings of the 14th International Conference on Computational Linguistics (COLING 92), 539-545, (1992).
  • Hearst, M.A., "Automated Discovery of WordNet Relations", WordNet: an electronic lexical database, C. Fellbaum (ed.), MIT Press, (1998).
  • Kupiec, J., "MURAX: A Robust Linguistic Approach For Question Answering Using An On-Line Encyclopedia", Proceedings of the 16th annual international ACM SIGIR conference on Research and Development in Information Retrieval, 181-190, (1993).
  • Miller, G.A., "WordNet: A lexical database for English", Communications of the ACM, 38(11), 39-41, (1995).
  • Moldovan, D., Harabagiu, S., Pasca, M., Mihalcea, R., Goodrum, R., Girju, R., Rus, V., "Lasso: A Tool for Surfing the Answer Net", NIST Special Publication XXX-XXX: The 8th Text REtrieval Conference (TREC 8), (1999).
  • Porter, M.F., "An algorithm for suffix stripping", Program - automated library and information systems, 14(3), 130-137, (1980).
  • Radev, D.R., McKeown, K.R., "Building a Generation Knowledge Source using Internet-Accessible Newswire", Proceedings of the 5th Conference on Applied Natural Language Processing (ANLP), 221-228, (1997).
  • Singhal, A., Abney, S., Bacchiani, M., Collins, M., Hindle, D., Pereira, F., "AT&T at TREC-8", NIST Special Publication XXX-XXX: The 8th Text REtrieval Conference (TREC 8), (1999).
  • Voorhees, E.M., Tice, D.M., "The TREC-8 Question Answering Track Evaluation", NIST Special Publication XXX-XXX: The 8th Text REtrieval Conference (TREC 8), (1999).


Author: Hideo Joho, Mark Sanderson
Title: Retrieving Descriptive Phrases from Large Amounts of Free Text
Title URL: http://dis.shef.ac.uk/mark/publications/my papers/CIKM00.pdf