2007 CoClusteringbasedClassification
- (Dai et al., 2007) ⇒ Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong Yu. (2007). “Co-clustering based Classification for Out-of-domain Documents.” In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. doi:10.1145/1281192.1281218
Subject Headings:
Notes
Cited By
- ~68 http://scholar.google.com/scholar?q=%22Co-clustering+based+classification+for+out-of-domain+documents%22+2007
- http://dl.acm.org/citation.cfm?doid=1281192.1281218&preflayout=flat#citedby
Quotes
Author Keywords
Classification, Co-clustering, Out-of-domain, Kullback-Leibler
Abstract
In many real world applications, labeled data are in short supply. It often happens that obtaining labeled data in a new domain is expensive and time consuming, while there may be plenty of labeled data from a related but different domain. Traditional machine learning is not able to cope well with learning across different domains. In this paper, we address this problem for a text-mining task, where the labeled data are under one distribution in one domain known as in-domain data, while the unlabeled data are under a related but different domain known as out-of-domain data. Our general goal is to learn from the in-domain and apply the learned knowledge to out-of-domain. We propose a co-clustering based classification (CoCC) algorithm to tackle this problem. Co-clustering is used as a bridge to propagate the class structure and knowledge from the in-domain to the out-of-domain. We present theoretical and empirical analysis to show that our algorithm is able to produce high quality classification results, even when the distributions between the two data are different. The experimental results show that our algorithm greatly improves the classification performance over the traditional learning algorithms.
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2007 CoClusteringbasedClassification | Wenyuan Dai, Gui-Rong Xue, Qiang Yang, Yong Yu | | | Co-clustering based Classification for Out-of-domain Documents | | | | 10.1145/1281192.1281218 | | 2007 |