1997 ProjectionsForEfficientDocumentClustering
- (Schütze & Silverstein, 1997) ⇒ Hinrich Schütze, and Craig Silverstein. (1997). “Projections for Efficient Document Clustering.” In: ACM SIGIR Forum, 31. doi:10.1145/278459.258539
Subject Headings: Lexical Semantic Similarity Function, Text Clustering Algorithm, Latent Semantic Indexing
Cited By
1999
- (Larsen and Aone, 1999) ⇒ Bjornar Larsen, and Chinatsu Aone. (1999). “Fast and Effective Text Mining Using Linear-time Document Clustering.” In: Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-1999).
Quotes
Abstract
Clustering is increasing in importance, but linear- and even constant-time clustering algorithms are often too slow for real-time applications. A simple way to speed up clustering is to speed up the distance calculations at the heart of clustering routines. We study two techniques for improving the cost of distance calculations, LSI and truncation, and determine both how much these techniques speed up clustering and how much they affect the quality of the resulting clusters. We find that the speed increase is significant while — surprisingly — the quality of clustering is not adversely affected. We conclude that truncation yields clusters as good as those produced by full-profile clustering while offering a significant speed advantage.
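The truncation idea named in the abstract can be illustrated with a minimal sketch. It assumes that truncation means keeping only the highest-weighted terms of a vector (here a cluster centroid) before computing similarities, so that each distance calculation touches fewer terms. The helper names `truncate`, `cosine`, and `centroid`, the toy term weights, and the cutoff `k=2` are hypothetical and are not taken from the paper. LSI, the other projection studied, is not shown.

```python
import math

def truncate(vec, k):
    """Keep the k largest-weight terms of a sparse vector (dict: term -> weight)."""
    # Ties are broken alphabetically so the result is deterministic.
    top = sorted(vec.items(), key=lambda kv: (-abs(kv[1]), kv[0]))[:k]
    return dict(top)

def cosine(a, b):
    """Cosine similarity of two sparse vectors; the dot product walks the shorter one."""
    if len(a) > len(b):
        a, b = b, a
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def centroid(docs):
    """Average a list of sparse document vectors into one centroid vector."""
    total = {}
    for doc in docs:
        for term, weight in doc.items():
            total[term] = total.get(term, 0.0) + weight
    return {term: weight / len(docs) for term, weight in total.items()}

# Toy term-weight profiles (assumed values, for illustration only).
docs = [
    {"cluster": 3.0, "document": 2.0, "speed": 1.0},
    {"cluster": 2.0, "document": 3.0, "index": 1.0},
    {"query": 3.0, "index": 2.0, "speed": 0.5},
]

full_centroid = centroid(docs[:2])           # full-profile centroid, 4 terms
short_centroid = truncate(full_centroid, 2)  # truncated centroid, 2 terms

for doc in docs:
    print(round(cosine(doc, full_centroid), 3),
          round(cosine(doc, short_centroid), 3))
```

Comparing the two printed columns gives a rough feel for how much the similarity scores shift when the centroid is truncated. Whether cluster quality survives on real data is the empirical question the paper addresses.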
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 1997 ProjectionsForEfficientDocumentClustering | Hinrich Schütze, Craig Silverstein | 31 | | Projections for Efficient Document Clustering | | ACM SIGIR Forum | http://dx.doi.org/10.1145/278459.258539 | 10.1145/278459.258539 | | 1997 |