2005 ASemanticKernelToClassifyText

From GM-RKB

Subject Headings:

Notes

Cited By

2006

Quotes

Abstract

  • Web-mediated access to distributed information is a complex problem. Before any learning can start, Web objects (e.g. texts) have to be detected and filtered accurately. In this perspective, text categorization is a useful device to filter out irrelevant evidence before other learning processes take place on huge sources of candidate information. The drawback is the need of a large number of training documents. One way to reduce such number relates to the use of more effective document similarities based on prior knowledge. Unfortunately, previous work has shown that such information (e.g. WordNet) causes the decrease of retrieval accuracy.
  • In this paper we propose kernel functions to add prior knowledge to learning algorithms for document classification. Such kernels use a term similarity measure based on the WordNet hierarchy. The kernel trick is used to implement such space in a balanced and statistically coherent way. Cross-validation results show the benefit of the approach for the Support Vector Machines when few training examples are available.
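
The idea of a document kernel built on term similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the toy hypernym table `HYPERNYM` is a hypothetical stand-in for WordNet, the path-based `term_similarity` is an assumed measure chosen only for illustration, and the sketch does not check that the resulting function is positive semi-definite. It sums, over all pairs of terms from two documents, the product of the term counts weighted by the similarity of the two terms.

```python
from collections import Counter

# Toy hypernym links (child -> parent); a hypothetical stand-in for WordNet.
HYPERNYM = {
    "dog": "canine", "wolf": "canine", "canine": "mammal",
    "cat": "feline", "feline": "mammal", "mammal": "animal",
    "car": "vehicle", "truck": "vehicle", "vehicle": "artifact",
}

def ancestors(term):
    """Return the chain [term, parent, grandparent, ...] up to the root."""
    chain = [term]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain

def term_similarity(a, b):
    """Assumed path-based similarity: 1 / (1 + edges between a and b).

    Returns 0.0 when the two terms share no ancestor in the toy hierarchy.
    """
    if a == b:
        return 1.0
    depth_b = {t: i for i, t in enumerate(ancestors(b))}
    for i, t in enumerate(ancestors(a)):
        if t in depth_b:
            # i edges up from a, depth_b[t] edges up from b
            return 1.0 / (1.0 + i + depth_b[t])
    return 0.0

def semantic_kernel(doc1, doc2):
    """Sum of count1 * count2 * similarity over all cross-document term pairs."""
    c1, c2 = Counter(doc1), Counter(doc2)
    return sum(
        w1 * w2 * term_similarity(t1, t2)
        for t1, w1 in c1.items()
        for t2, w2 in c2.items()
    )

if __name__ == "__main__":
    d1 = ["dog", "barks"]
    d2 = ["wolf", "howls"]
    # The documents share no word, so a plain bag-of-words dot product is 0,
    # while the similarity-weighted sum is nonzero because "dog" and "wolf"
    # share the ancestor "canine" in the toy hierarchy.
    print(semantic_kernel(d1, d2))
```

With a function of this shape, two short documents that share no surface words can still receive a nonzero similarity, which is the kind of prior knowledge the abstract refers to when few training examples are available.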

References


  • Roberto Basili, Alessandro Moschitti, and Marco Cammisa. (2005). "A Semantic Kernel to Classify Text with Very Few Training Examples." http://dit.unitn.it/~moschitt/articles/ICML2005-ws.pdf