2000 RetrievingDescrPhrasesfromLargeAmountsofFreeText
- (Joho & Sanderson, 2000) ⇒ Hideo Joho, and Mark Sanderson. (2000). “Retrieving Descriptive Phrases from Large Amounts of Free Text.” In: Proceedings of the Ninth International Conference on Information and Knowledge Management (CIKM 2000). doi:10.1145/354756.354817
Subject Headings: Information retrieval, descriptive phrase, large corpora
Notes
Cited By
Quotes
Abstract
This paper presents a system that retrieves descriptive phrases of proper nouns from free text. Sentences holding the specified noun are ranked using a technique based on pattern matching, word counting, and sentence location. No domain specific knowledge is used. Experiments show the system able to rank highly those sentences that contain phrases describing or defining the query noun. In contrast to existing methods, this system does not use parsing techniques but still achieves high levels of accuracy. From the results of a large-scale experiment, it is speculated that the success of this simpler method is due to the high quantities of free text being searched. Parallels between this work and recent findings in the very large corpus track of TREC are drawn.
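The ranking idea in the abstract (pattern matching, word counting, and sentence location, with no parsing) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the four lexical patterns, the use of query-noun occurrences as a stand-in for word counting, the weights, and the function names `build_patterns`, `score_sentence`, and `rank_sentences` are all hypothetical.

```python
import re


def build_patterns(noun):
    """Return illustrative (hypothetical) lexical patterns around the query noun."""
    n = re.escape(noun)
    return [
        re.compile(rf"\bsuch as {n}\b", re.IGNORECASE),            # "<description> such as X"
        re.compile(rf"\b{n},? and other\b", re.IGNORECASE),        # "X and other <description>"
        re.compile(rf"\b{n} (?:is|was) (?:a|an|the)\b", re.IGNORECASE),  # "X is a <description>"
        re.compile(rf"\b{n}, (?:a|an|the)\b", re.IGNORECASE),      # apposition: "X, the <description>"
    ]


def score_sentence(sentence, position, noun, patterns):
    """Combine three signals with assumed weights (not the paper's values)."""
    # Pattern matching: how many of the patterns fire on this sentence.
    pattern_hits = sum(1 for p in patterns if p.search(sentence))
    # Word counting: here simply the number of occurrences of the query noun.
    noun_count = len(re.findall(rf"\b{re.escape(noun)}\b", sentence, re.IGNORECASE))
    # Sentence location: earlier sentences in a document get a larger bonus.
    location_bonus = 1.0 / (1 + position)
    return 2.0 * pattern_hits + 0.5 * noun_count + location_bonus


def rank_sentences(document_sentences, noun):
    """Keep only sentences containing the noun and sort them by descending score."""
    patterns = build_patterns(noun)
    noun_re = re.compile(rf"\b{re.escape(noun)}\b", re.IGNORECASE)
    scored = [
        (score_sentence(s, i, noun, patterns), s)
        for i, s in enumerate(document_sentences)
        if noun_re.search(s)
    ]
    # sorted() is stable, so tied sentences keep their document order.
    return [s for _, s in sorted(scored, key=lambda pair: pair[0], reverse=True)]


if __name__ == "__main__":
    sentences = [
        "The meeting was held in Paris.",
        "Tony Blair visited on Monday.",
        "Tony Blair, the British prime minister, spoke first.",
    ]
    for s in rank_sentences(sentences, "Tony Blair"):
        print(s)
```

In this toy input the appositive sentence ranks first because a pattern fires on it, even though it appears later in the document. A real system of this kind would presumably tune the patterns and weights against relevance judgements rather than fix them by hand.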
References
- Caraballo, S.A., Charniak, E., "Determining the specificity of nouns from text", Proceedings of the joint SIGDAT conference on empirical methods in natural language processing (EMNLP) and very large corpora (VLC), 63-70, (1999).
- Chinchor, N.A., "Overview of MUC-7/MET-2", Proceedings of the Seventh Message Understanding Conference (MUC-7), (1998).
- Cooper, W.S., "Fact Retrieval and Deductive Question-Answering Information Retrieval Systems", Journal of the ACM, ACM Press, 11(2), 117-137, (1964).
- Hawking, D., Thistlewaite, P., "Overview of TREC-6 Very Large Collection Track", NIST Special Publication 500-240: The Sixth Text REtrieval Conference (TREC 6), E.M. Voorhees, D.K. Harman (eds.), 93-106, (1997).
- Hearst, M., "Automatic Acquisition of Hyponyms from Large Text Corpora", Proceedings of the 14th International Conference on Computational Linguistics (COLING 92), 539-545, (1992).
- Hearst, M.A., "Automated Discovery of WordNet Relations", WordNet: an electronic lexical database, C. Fellbaum (ed.), MIT Press, (1998).
- Kupiec, J., "MURAX: A Robust Linguistic Approach For Question Answering Using An On-Line Encyclopedia", Proceedings of the 16th annual international ACM SIGIR conference on Research and Development in Information Retrieval, 181-190, (1993).
- Miller, G.A., "WordNet: A lexical database for English", Communications of the ACM, 38(11), 39-41, (1995).
- Moldovan, D., Harabagiu, S., Pasca, M., Mihalcea, R., Goodrum, R., Girju, R., Rus, V., "Lasso: A Tool for Surfing the Answer Net", NIST Special Publication XXX-XXX: The 8th Text REtrieval Conference (TREC 8), (1999).
- Porter, M.F., "An algorithm for suffix stripping", Program - automated library and information systems, 14(3), 130-137, (1980).
- Radev, D.R., McKeown, K.R., "Building a Generation Knowledge Source using Internet-Accessible Newswire", Proceedings of the 5th Conference on Applied Natural Language Processing (ANLP), 221-228, (1997).
- Singhal, A., Abney, S., Bacchiani, M., Collins, M., Hindle, D., Pereira, F., "AT&T at TREC-8", NIST Special Publication XXX-XXX: The 8th Text REtrieval Conference (TREC 8), (1999).
- Voorhees, E.M., Tice, D.M., "The TREC-8 Question Answering Track Evaluation", NIST Special Publication XXX-XXX: The 8th Text REtrieval Conference (TREC 8), (1999).
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2000 RetrievingDescrPhrasesfromLargeAmountsofFreeText | Hideo Joho, Mark Sanderson | | | Retrieving Descriptive Phrases from Large Amounts of Free Text | | | http://dis.shef.ac.uk/mark/publications/my papers/CIKM00.pdf | | | |