Cluster Purity Metric
A Cluster Purity Metric is a Performance Metric that ...
- AKA: Cluster Purity, Purity.
- See: Text Clustering Task, F-Score Metric, Normalized Mutual Information Metric.
References
2009
- (Hu et al., 2009) ⇒ Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, and Xiaohua Zhou. (2009). “Exploiting Wikipedia as External Knowledge for Document Clustering.” In: Proceedings of ACM SIGKDD Conference (KDD-2009). doi:10.1145/1557019.1557066
- Cluster quality is evaluated by three metrics, purity [14], F-score [10], and normalized mutual information (NMI) [15]. Purity assumes that all samples of a cluster are predicted to be members of the actual dominant class for that cluster. … A merit of NMI is that it does not necessarily increase when the number of clusters increases. All the three metrics range from 0 to 1, and the higher their value, the better the clustering quality is.
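- Example: The description of purity quoted above can be illustrated with a minimal sketch. It assumes the commonly used formulation in which each cluster is credited with the count of its dominant class and the total is divided by the number of samples. The function name `purity` and its parameter names are hypothetical and are not taken from the cited papers.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, class_labels):
    """Fraction of samples that belong to the dominant class of their cluster.

    Assumed formulation: sum over clusters of the largest class count,
    divided by the total number of samples.
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have equal length")
    if len(cluster_ids) == 0:
        raise ValueError("purity is undefined for an empty clustering")

    # Count how often each true class occurs inside each cluster.
    class_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        class_counts[cluster][label] += 1

    # Credit each cluster with the size of its dominant class.
    dominant_total = sum(max(c.values()) for c in class_counts.values())
    return dominant_total / len(cluster_ids)


# Two clusters of three samples each; the dominant class covers 2 of 3 in both.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["a", "a", "b", "b", "b", "c"]
score = purity(clusters, labels)  # 4 dominant-class samples out of 6
```

Under this formulation, placing every sample in its own cluster yields a purity of 1.0, which is consistent with the quoted remark that NMI, unlike such a score, does not necessarily increase with the number of clusters.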
2001
- (Zhao & Karypis, 2001) ⇒ Ying Zhao, and George Karypis. (2001). “Criterion Functions for Document Clustering: Experiments and Analysis.” Technical Report TR #01-40, Department of Computer Science, University of Minnesota, Minneapolis, MN.