2014 ADirichletMultinomialMixtureMod
- (Yin & Wang, 2014) ⇒ Jianhua Yin, and Jianyong Wang. (2014). “A Dirichlet Multinomial Mixture Model-based Approach for Short Text Clustering.” In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2014). ISBN:978-1-4503-2956-9 doi:10.1145/2623330.2623715
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222014%22+A+Dirichlet+Multinomial+Mixture+Model-based+Approach+for+Short+Text+Clustering
- http://dl.acm.org/citation.cfm?id=2623330.2623715&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
Short text clustering has become an increasingly important task with the popularity of social media like Twitter, Google +, and Facebook. It is a challenging problem due to its sparse, high-dimensional, and large-volume characteristics. In this paper, we proposed a collapsed Gibbs Sampling algorithm for the Dirichlet Multinomial Mixture model for short text clustering (abbr. to GSDMM). We found that GSDMM can infer the number of clusters automatically with a good balance between the completeness and homogeneity of the clustering results, and is fast to converge. GSDMM can also cope with the sparse and high-dimensional problem of short texts, and can obtain the representative words of each cluster. Our extensive experimental study shows that GSDMM can achieve significantly better performance than three other clustering models.
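The abstract names a collapsed Gibbs sampler for the Dirichlet Multinomial Mixture model but does not spell out its steps. The block below is a minimal sketch of how such a sampler is commonly written, not the authors' implementation. The function name `gsdmm`, the parameters `K`, `alpha`, `beta`, `n_iters`, and `seed`, and their default values are assumptions chosen for illustration. The sampling weight used here (a cluster-size term times a word-compatibility term) is the usual form of the collapsed conditional for this model and should be checked against the paper before being relied on.

```python
import math
import random
from collections import Counter


def gsdmm(docs, K=8, alpha=0.1, beta=0.1, n_iters=15, seed=0):
    """Cluster tokenized short texts; returns one cluster label per document.

    Illustrative sketch only: K is an upper bound on the number of clusters,
    alpha and beta are symmetric Dirichlet hyperparameters (assumed values).
    """
    rng = random.Random(seed)
    V = len({w for doc in docs for w in doc})   # vocabulary size
    m = [0] * K                                 # documents per cluster
    n = [0] * K                                 # total word tokens per cluster
    nw = [Counter() for _ in range(K)]          # per-cluster word counts
    z = []                                      # cluster label of each document

    def move(doc, k, sign):
        """Add (sign=+1) or remove (sign=-1) a document from cluster k."""
        m[k] += sign
        n[k] += sign * len(doc)
        for w in doc:
            nw[k][w] += sign

    # Random initial assignment.
    for doc in docs:
        k = rng.randrange(K)
        z.append(k)
        move(doc, k, +1)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            move(doc, z[d], -1)                 # exclude the document's own counts
            counts = Counter(doc)
            log_p = []
            for k in range(K):
                # Cluster-size term; factors constant across k are dropped.
                lp = math.log(m[k] + alpha)
                # Word-compatibility term, handling repeated words in the document.
                for w, c in counts.items():
                    for j in range(c):
                        lp += math.log(nw[k][w] + beta + j)
                for i in range(len(doc)):
                    lp -= math.log(n[k] + V * beta + i)
                log_p.append(lp)
            mx = max(log_p)                     # subtract max for numerical stability
            weights = [math.exp(lp - mx) for lp in log_p]
            z[d] = rng.choices(range(K), weights=weights)[0]
            move(doc, z[d], +1)
    return z
```

A call such as `gsdmm([["goal", "match"], ["match", "team"], ["apple", "fruit"]], K=4)` returns a list of labels in `range(4)`. Clusters left with no documents after sampling are simply unused, which is how a sampler of this kind can end up with fewer than `K` clusters.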
References
| | Author | title | doi | year |
|---|---|---|---|---|
| 2014 ADirichletMultinomialMixtureMod | Jianyong Wang, Jianhua Yin | A Dirichlet Multinomial Mixture Model-based Approach for Short Text Clustering | 10.1145/2623330.2623715 | 2014 |