2001 CoClusteringDocsAndWords
- (Dhillon, 2001) ⇒ Inderjit S. Dhillon. (2001). “Co-Clustering Documents and Words Using Bipartite Spectral Graph Partitioning.” In: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2001). doi:10.1145/502512.502550
Subject Headings: Co-clustering Algorithm, Document Clustering Task, Word Clustering Task.
Notes
Cited By
Quotes
Abstract
Both document clustering and word clustering are well studied problems. Most existing algorithms cluster documents and words separately but not simultaneously. In this paper we present the novel idea of modeling the document collection as a bipartite graph between documents and words, using which the simultaneous clustering problem can be posed as a bipartite graph partitioning problem. To solve the partitioning problem, we use a new spectral co-clustering algorithm that uses the second left and right singular vectors of an appropriately scaled word-document matrix to yield good bipartitionings. The spectral algorithm enjoys some optimality properties; it can be shown that the singular vectors solve a real relaxation to the NP-complete graph bipartitioning problem. We present experimental results to verify that the resulting co-clustering algorithm works well in practice.
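The bipartitioning step described in the abstract can be illustrated with a minimal sketch. It assumes the "appropriately scaled" matrix is the degree-normalized form D1^(-1/2) A D2^(-1/2), where D1 and D2 hold the row and column sums of the word-document matrix A, and it splits the resulting one-dimensional embedding with a simple two-means loop. The function name `spectral_cocluster_bipartition`, the two-means initialization, and the toy matrix are illustrative choices, not taken from the abstract.

```python
import numpy as np


def spectral_cocluster_bipartition(A):
    """Split the rows (words) and columns (documents) of A into two co-clusters.

    Sketch only: the scaling and the 1-D two-means step are assumptions.
    """
    A = np.asarray(A, dtype=float)
    row_deg = A.sum(axis=1)  # word degrees in the bipartite graph
    col_deg = A.sum(axis=0)  # document degrees in the bipartite graph
    if np.any(row_deg <= 0) or np.any(col_deg <= 0):
        raise ValueError("every word and document needs a nonzero entry")

    d1 = 1.0 / np.sqrt(row_deg)
    d2 = 1.0 / np.sqrt(col_deg)
    # Assumed scaling: An = D1^(-1/2) A D2^(-1/2)
    An = d1[:, None] * A * d2[None, :]

    # Singular values come back in descending order, so index 1 is the second.
    U, _, Vt = np.linalg.svd(An, full_matrices=False)
    u2, v2 = U[:, 1], Vt[1, :]

    # One coordinate per word and per document, stacked into a single vector.
    z = np.concatenate([d1 * u2, d2 * v2])

    # Simple 1-D two-means, initialized at the extremes of z.
    centers = np.array([z.min(), z.max()])
    labels = np.zeros(z.shape[0], dtype=int)
    for _ in range(100):
        labels = (np.abs(z - centers[0]) > np.abs(z - centers[1])).astype(int)
        new_centers = centers.copy()
        for k in (0, 1):
            if np.any(labels == k):
                new_centers[k] = z[labels == k].mean()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    n_words = A.shape[0]
    return labels[:n_words], labels[n_words:]


# Toy word-document count matrix (6 words x 4 documents) with two topical blocks.
A = np.array([
    [5, 4, 0, 0],
    [4, 5, 1, 0],
    [5, 5, 0, 0],
    [0, 0, 5, 4],
    [0, 1, 4, 5],
    [0, 0, 5, 5],
])
word_labels, doc_labels = spectral_cocluster_bipartition(A)
print(word_labels, doc_labels)
```

Words and documents that receive the same label form one co-cluster. Because both label vectors come from a single stacked embedding, the word and document partitions are produced simultaneously rather than by two separate clustering runs.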
References
- …
- (Strehl et al., 2000) ⇒ Alexander Strehl, Joydeep Ghosh, and Raymond Mooney. (2000). “Impact of Similarity Measures on Web-page Clustering.” In: Workshop at AAAI 2000 on Artificial Intelligence for Web Search.
- …
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2001 CoClusteringDocsAndWords | Inderjit S. Dhillon | | | Co-Clustering Documents and Words Using Bipartite Spectral Graph Partitioning | | KDD-2001 | http://www.cis.upenn.edu/group/datamining/ReadingGroup/papers/dhillon2001.pdf | 10.1145/502512.502550 | | 2001 |