2005 AHybridUnsupApprForDocClust
- (Surdeanu et al., 2005) ⇒ Mihai Surdeanu, Jordi Turmo, Alicia Ageno. (2005). “A Hybrid Unsupervised Approach for Document Clustering.” In: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining (KDD-2005). doi:10.1145/1081870.1081957
Subject Headings: Unsupervised Text Clustering Algorithm, Text Clustering Algorithm.
Notes
Cited By
~17 http://scholar.google.com/scholar?cites=9979505397248497791
Quotes
Abstract
- We propose a hybrid, unsupervised document clustering approach that combines a hierarchical clustering algorithm with Expectation Maximization. We developed several heuristics to automatically select a subset of the clusters generated by the first algorithm as the initial points of the second one. Furthermore, our initialization algorithm generates not only an initial model for the iterative refinement algorithm but also an estimate of the model dimension, thus eliminating another important element of human supervision. We have evaluated the proposed system on five real-world document collections. The results show that our approach generates clustering solutions of higher quality than both its individual components.
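The pipeline described in the abstract (hierarchical clustering, then selection of a subset of its clusters as starting points, then Expectation Maximization) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the selection heuristics, so `select_seeds` below uses a stand-in rule (keep clusters with at least `min_size` members), and the names `agglomerate`, `select_seeds`, `em`, `stop_distance`, `min_size`, and `sigma` are all hypothetical. It also runs on toy 2-D points with a fixed-variance spherical Gaussian mixture rather than on document vectors.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(cluster):
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n for d in range(len(cluster[0])))

def agglomerate(points, stop_distance):
    """Centroid-linkage agglomerative clustering (assumed linkage).
    Merges the closest pair until the closest centroids are farther
    apart than stop_distance."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > stop_distance:
            break
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def select_seeds(clusters, min_size):
    """Stand-in heuristic (not from the paper): keep clusters with at
    least min_size members. The number of seeds doubles as the estimate
    of the model dimension k."""
    return [centroid(c) for c in clusters if len(c) >= min_size]

def responsibilities(points, means, weights, sigma):
    """E-step: soft assignment of every point to every component."""
    resp = []
    for p in points:
        scores = [w * math.exp(-dist(p, m) ** 2 / (2 * sigma ** 2))
                  for w, m in zip(weights, means)]
        total = sum(scores) or 1.0  # guard against underflow to zero
        resp.append([s / total for s in scores])
    return resp

def em(points, seeds, sigma=1.0, iterations=20):
    """EM for a spherical Gaussian mixture with fixed shared variance,
    started from the selected seeds."""
    means = list(seeds)
    k = len(means)
    weights = [1.0 / k] * k
    for _ in range(iterations):
        resp = responsibilities(points, means, weights, sigma)
        for j in range(k):  # M-step: re-estimate weights and means
            nj = sum(r[j] for r in resp)
            if nj == 0:
                continue
            weights[j] = nj / len(points)
            means[j] = tuple(
                sum(r[j] * p[d] for r, p in zip(resp, points)) / nj
                for d in range(len(points[0])))
    resp = responsibilities(points, means, weights, sigma)
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, labels

# Toy data: two tight groups plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11), (5, 20)]
clusters = agglomerate(points, stop_distance=3.0)
seeds = select_seeds(clusters, min_size=2)   # singleton cluster is dropped
means, labels = em(points, seeds)
print(len(seeds), labels)
```

In this sketch the hierarchical pass yields three clusters, the size filter discards the singleton, and EM then starts from two seeds, so the number of mixture components comes from the initialization step rather than from the user.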
References
 | Author | title | titleUrl | doi | year |
---|---|---|---|---|---|
2005 AHybridUnsupApprForDocClust | Mihai Surdeanu, Jordi Turmo, Alicia Ageno | A Hybrid Unsupervised Approach for Document Clustering | http://www.surdeanu.name/mihai/papers/kdd05.pdf | 10.1145/1081870.1081957 | 2005 |