2008 BuildingSemanticKernelsforTextC
- (Wang et al., 2008) ⇒ Pu Wang and Carlotta Domeniconi. (2008). “Building Semantic Kernels for Text Classification Using Wikipedia.” In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2008). doi:10.1145/1401890.1401976
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%22Building+semantic+kernels+for+text+classification+using+wikipedia%22+2008
- http://portal.acm.org/citation.cfm?doid=1401890.1401976&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
Document classification presents difficult challenges due to the sparsity and the high dimensionality of text data, and to the complex semantics of the natural language. The traditional document representation is a word-based vector (Bag of Words, or BOW), where each dimension is associated with a term of the dictionary containing all the words that appear in the corpus. Although simple and commonly used, this representation has several limitations. It is essential to embed semantic information and conceptual patterns in order to enhance the prediction capabilities of classification algorithms. In this paper, we overcome the shortages of the BOW approach by embedding background knowledge derived from Wikipedia into a semantic kernel, which is then used to enrich the representation of documents. Our empirical evaluation with real data sets demonstrates that our approach successfully achieves improved classification accuracy with respect to the BOW technique, and to other recently developed methods.
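The abstract's core idea, enriching a BOW vector with background knowledge before computing document similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: the vocabulary, the proximity matrix `S`, and the helper names (`bow`, `enrich`, `semantic_kernel`) are hypothetical, and the proximity values are hand-written, whereas the paper derives its background knowledge from Wikipedia. The sketch assumes one common form of semantic kernel, in which each document vector is multiplied by a term-proximity matrix before taking the inner product.

```python
# Toy vocabulary; terms and proximity values are made up for illustration only.
VOCAB = ["puma", "cougar", "car"]

# Hypothetical term-to-term proximity matrix (rows and columns follow VOCAB).
# "puma" and "cougar" are treated as closely related; "car" is unrelated.
S = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]


def bow(tokens):
    """Plain bag-of-words vector: one term count per vocabulary entry."""
    return [float(tokens.count(term)) for term in VOCAB]


def enrich(vec):
    """Map a BOW vector into the enriched space by computing vec * S."""
    return [
        sum(vec[i] * S[i][j] for i in range(len(vec)))
        for j in range(len(S[0]))
    ]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def linear_kernel(doc1, doc2):
    """Baseline BOW similarity: inner product of raw term counts."""
    return dot(bow(doc1), bow(doc2))


def semantic_kernel(doc1, doc2):
    """Similarity after enrichment: (d1 * S) . (d2 * S)."""
    return dot(enrich(bow(doc1)), enrich(bow(doc2)))


if __name__ == "__main__":
    d1, d2 = ["puma"], ["cougar"]
    print("BOW kernel:     ", linear_kernel(d1, d2))
    print("Semantic kernel:", semantic_kernel(d1, d2))
```

In this toy setup the two documents share no terms, so the plain BOW kernel scores them as unrelated, while the enriched kernel gives them a positive similarity because the assumed proximity matrix links the two terms.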
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2008 BuildingSemanticKernelsforTextC | Pu Wang, Carlotta Domeniconi | | | Building Semantic Kernels for Text Classification Using Wikipedia | | | | 10.1145/1401890.1401976 | | 2008 |