2008 TopicIndexingWithWikipedia
- (Medelyan et al., 2008) ⇒ Olena Medelyan, Ian H. Witten, David N. Milne. (2008). “Topic Indexing with Wikipedia.” In: Proceedings of the AAAI 2008 Workshop on Wikipedia and Artificial Intelligence (WIKIAI 2008).
Subject Headings: Topic Indexing Task, Wikipedia.
Notes
Cited By
- (Milne & Witten, 2008a) ⇒ David N. Milne, and Ian H. Witten. (2008). “Learning to Link with Wikipedia.” In: Proceedings of the 17th ACM Conference on Information and Knowledge Management, (CIKM 2008). doi:10.1145/1458082.1458150
- Medelyan et al. (2008) make these similarities very clear in their approach to topic indexing with Wikipedia, and even reuse Wikify’s approach for detecting significant terms. They differ in how they disambiguate terms, however. They gain similar results much more cheaply by balancing (a) the commonness (or prior probability) of each sense and (b) how the sense relates to its surrounding context. This approach is explained in Section 3.1, where we improve upon it by weighting context terms and using machine learning to balance commonness and relatedness.
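The balance between commonness and contextual relatedness described in the quote above can be illustrated with a minimal sketch. It assumes one simple way of combining the two signals, multiplying a sense's commonness by its average relatedness to the context articles; the quote does not give the exact formula. The function names, the toy relatedness table, and all numeric values are hypothetical.

```python
from typing import Callable, Dict, List


def disambiguate(
    candidates: Dict[str, float],
    context: List[str],
    relatedness: Callable[[str, str], float],
) -> str:
    """Return the candidate sense with the highest combined score.

    `candidates` maps each sense (an article title) to its commonness,
    i.e. the prior probability that the term refers to that sense.
    Assumption: score = commonness * average relatedness to the context.
    """

    def score(sense: str) -> float:
        if not context:
            # Without context, fall back to commonness alone.
            return candidates[sense]
        avg_rel = sum(relatedness(sense, c) for c in context) / len(context)
        return candidates[sense] * avg_rel

    return max(candidates, key=score)


# Hypothetical relatedness values between senses and context articles.
TOY_RELATEDNESS = {
    ("Jaguar (animal)", "Rainforest"): 0.8,
    ("Jaguar (animal)", "Predator"): 0.7,
    ("Jaguar Cars", "Rainforest"): 0.05,
    ("Jaguar Cars", "Predator"): 0.1,
}


def toy_relatedness(sense: str, context_article: str) -> float:
    """Look up a toy relatedness value; unknown pairs count as unrelated."""
    return TOY_RELATEDNESS.get((sense, context_article), 0.0)


# Hypothetical commonness values for the ambiguous term "jaguar".
senses = {"Jaguar Cars": 0.6, "Jaguar (animal)": 0.4}
best = disambiguate(senses, ["Rainforest", "Predator"], toy_relatedness)
```

In this toy setup the less common sense wins because it fits the context far better, while an empty context falls back to the most common sense.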
Quotes
Abstract
- Wikipedia can be utilized as a controlled vocabulary for identifying the main topics in a document, with article titles serving as index terms and redirect titles as their synonyms. Wikipedia contains over 4M such titles covering the terminology of nearly any document collection. This permits controlled indexing in the absence of manually created vocabularies. We combine state-of-the-art strategies for automatic controlled indexing with Wikipedia’s unique property — a richly hyperlinked encyclopedia. We evaluate the scheme by comparing automatically assigned topics with those chosen manually by human indexers. Analysis of indexing consistency shows that our algorithm performs as well as the average person.
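The two ideas in the abstract, a vocabulary built from article titles with redirects as synonyms and an evaluation based on indexing consistency, can be sketched as follows. The abstract does not say which consistency measure is used, so this sketch assumes Rolling's measure, 2C / (A + B), where C is the number of topics two indexers share and A and B are the sizes of their topic sets. The redirect table, function names, and example phrases are hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy vocabulary: lowercased titles and redirects map to the
# canonical article title that serves as the index term.
REDIRECTS: Dict[str, str] = {
    "ai": "Artificial intelligence",
    "machine intelligence": "Artificial intelligence",
    "artificial intelligence": "Artificial intelligence",
    "wikipedia": "Wikipedia",
}


def to_index_terms(phrases: Iterable[str]) -> Set[str]:
    """Map phrases to canonical titles, dropping out-of-vocabulary phrases."""
    terms: Set[str] = set()
    for phrase in phrases:
        title = REDIRECTS.get(phrase.strip().lower())
        if title is not None:
            terms.add(title)
    return terms


def rolling_consistency(a: Set[str], b: Set[str]) -> float:
    """Rolling's consistency (assumed measure): 2 * |a & b| / (|a| + |b|)."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty topic sets
    return 2 * len(a & b) / (len(a) + len(b))


algorithm_topics = to_index_terms(["AI", "Wikipedia", "unknown phrase"])
human_topics = to_index_terms(["machine intelligence"])
consistency = rolling_consistency(algorithm_topics, human_topics)
```

Because both "AI" and "machine intelligence" resolve to the same article title, the two indexers agree on that topic even though they used different surface forms.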