2010 DocumentClusteringviaDirichletP
- (Yu et al., 2010) ⇒ Guan Yu, Ruizhang Huang, and Zhaojun Wang. (2010). “Document Clustering via Dirichlet Process Mixture Model with Feature Selection.” In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2010). doi:10.1145/1835804.1835901
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%22Document+clustering+via+dirichlet+process+mixture+model+with+feature+selection%22+2010
- http://portal.acm.org/citation.cfm?id=1835901&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
One essential issue of document clustering is to estimate the appropriate number of clusters for a document collection to which documents should be partitioned. In this paper, we propose a novel approach, namely DPMFS, to address this issue. The proposed approach is designed 1) to group documents into a set of clusters while the number of document clusters is determined by the Dirichlet process mixture model automatically; 2) to identify the discriminative words and separate them from irrelevant noise words via stochastic search variable selection technique. We explore the performance of our proposed approach on both a synthetic dataset and several realistic document datasets. The comparison between our proposed approach and state-of-the-art document clustering approaches indicates that our approach is robust and effective for document clustering.
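The abstract's first point, that the number of clusters is determined by the model instead of being fixed in advance, can be illustrated with the Chinese restaurant process, a standard representation of the partition prior induced by a Dirichlet process. The sketch below samples from that prior only. It is not the DPMFS algorithm: it has no document likelihood and no feature selection step. The function name `crp_partition` and the parameter names `n_items`, `alpha`, and `seed` are hypothetical and chosen for illustration.

```python
import random


def crp_partition(n_items, alpha, seed=0):
    """Sample cluster assignments from a Chinese restaurant process prior.

    Assumption: alpha > 0 is the concentration parameter; larger values
    tend to produce more clusters. This is a prior-only illustration.
    """
    rng = random.Random(seed)
    assignments = []    # cluster index for each item, in arrival order
    cluster_sizes = []  # number of items currently in each cluster

    for _ in range(n_items):
        # An existing cluster k is chosen with weight equal to its size;
        # a brand-new cluster is chosen with weight alpha.
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)  # open a new cluster
        else:
            cluster_sizes[k] += 1    # join an existing cluster
        assignments.append(k)

    return assignments, cluster_sizes


if __name__ == "__main__":
    assignments, sizes = crp_partition(n_items=200, alpha=1.0, seed=42)
    print("number of clusters:", len(sizes))
    print("cluster sizes:", sizes)
```

The number of clusters is an outcome of the sampling, not an input. In a full mixture model, the weights would additionally be multiplied by how well each cluster explains the document being assigned.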
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2010 DocumentClusteringviaDirichletP | Guan Yu, Ruizhang Huang, Zhaojun Wang | | | Document Clustering via Dirichlet Process Mixture Model with Feature Selection | | KDD-2010 Proceedings | | 10.1145/1835804.1835901 | | 2010 |