2006 IdentifyingDocTopicsUsingWikipedCatNet
- (Schönhofen, 2006) ⇒ Peter Schönhofen. (2006). “Identifying Document Topics Using the Wikipedia Category Network.” In: Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006). doi:10.1109/WI.2006.92
Subject Headings: Wikipedia Category Network.
Notes
Quotes
Abstract
- In the last few years the size and coverage of Wikipedia, a community edited, freely available on-line encyclopedia has reached the point where it can be effectively used to identify topics discussed in a document, similarly to an ontology or taxonomy. In this paper we will show that even a fairly simple algorithm that exploits only the titles and categories of Wikipedia articles can characterize documents by Wikipedia categories surprisingly well. We test the reliability of our method by predicting categories of Wikipedia articles themselves based on their bodies, and also by performing classification and clustering on 20 Newsgroups and RCV1, representing documents by their Wikipedia categories instead of (or in addition to) their texts.
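The idea in the abstract, characterizing a document using only the titles and categories of Wikipedia articles, can be illustrated with a minimal sketch. This is not the paper's algorithm: it simply assumes that article titles found verbatim in the text each cast one vote for their categories. The mapping `TITLE_TO_CATEGORIES`, the function names, and the `max_title_len` parameter are all hypothetical and introduced only for illustration; a real system would build the mapping from a Wikipedia dump.

```python
from collections import Counter
import re

# Hypothetical toy mapping from lowercase article title to its categories.
# In practice this would be extracted from a Wikipedia dump (assumption).
TITLE_TO_CATEGORIES = {
    "neural network": ["Machine learning", "Artificial intelligence"],
    "support vector machine": ["Machine learning", "Classification algorithms"],
    "baseball": ["Ball games", "Team sports"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_categories(text, title_to_categories, max_title_len=3):
    """Score categories by counting article titles that occur in the text.

    Every word n-gram (up to max_title_len words) that exactly matches an
    article title adds one vote to each of that article's categories.
    Returns (category, score) pairs sorted by descending score.
    """
    tokens = tokenize(text)
    scores = Counter()
    for n in range(1, max_title_len + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            for category in title_to_categories.get(phrase, ()):
                scores[category] += 1
    return scores.most_common()


if __name__ == "__main__":
    doc = "A support vector machine and a neural network are compared."
    for category, score in rank_categories(doc, TITLE_TO_CATEGORIES):
        print(category, score)
```

The resulting category scores could then serve as document features in place of, or alongside, the raw text, which is the kind of representation the abstract describes evaluating on 20 Newsgroups and RCV1.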
| | Author | title | doi | year |
|---|---|---|---|---|
| 2006 IdentifyingDocTopicsUsingWikipedCatNet | Peter Schönhofen | Identifying Document Topics Using the Wikipedia Category Network | 10.1109/WI.2006.92 | 2006 |