Sugato Basu
Sugato Basu is a person.
- See: Constrained Clustering.
References
- Professional Homepage: http://userweb.cs.utexas.edu/~sugato/
- DBLP Author Page: http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/b/Basu:Sugato.html
2009
- (Sculley et al., 2009) ⇒ D. Sculley, Robert G. Malkin, Sugato Basu, and Roberto J. Bayardo. (2009). “Predicting Bounce Rates in Sponsored Search Advertisements.” In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2009). doi:10.1145/1557019.1557161
2008
- (Basu et al., 2008) ⇒ Sugato Basu, Ian Davidson, and Kiri Wagstaff; editors. (2008). “Constrained Clustering: Advances in Algorithms, Theory, and Applications.” Chapman & Hall. ISBN:1584889969
2005
- (Bilenko et al., 2005) ⇒ Mikhail Bilenko, Sugato Basu, and Mehran Sahami. (2005). “Adaptive Product Normalization: Using Online Learning for Record Linkage in Comparison Shopping.” In: Proceedings of the 5th IEEE International Conference on Data Mining (ICDM-2005).
- (Kulis et al., 2005) ⇒ Brian Kulis, Sugato Basu, Inderjit Dhillon, and Raymond Mooney. (2005). “Semi-supervised Graph Clustering: A Kernel Approach.” In: Proceedings of the 22nd International Conference on Machine Learning (ICML-2005). doi:10.1145/1102351.1102409
2004
- (Basu et al., 2004) ⇒ Sugato Basu, Mikhail Bilenko, and Raymond Mooney. (2004). “A Probabilistic Framework for Semi-Supervised Clustering.” In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004). doi:10.1145/1014052.1014062
2002
- (Basu et al., 2002) ⇒ Sugato Basu, Arindam Banerjee, and Raymond Mooney. (2002). “Semi-supervised Clustering by Seeding.” In: Proceedings of the 19th International Conference on Machine Learning (ICML 2002).
- Category: Text Categorization and Clustering, Unsupervised and Semi-Supervised Learning and Clustering
- Cited by ~244 http://scholar.google.com/scholar?cites=14573234073707527294
- ABSTRACT: Semi-supervised clustering uses a small amount of labeled data to aid and bias the clustering of unlabeled data. This paper explores the use of labeled data to generate initial seed clusters, as well as the use of constraints generated from labeled data to guide the clustering process. It introduces two semi-supervised variants of KMeans clustering that can be viewed as instances of the EM algorithm, where labeled data provides prior information about the conditional distributions of hidden category labels. Experimental results demonstrate the advantages of these methods over standard random seeding and COP-KMeans, a previously developed semi-supervised clustering algorithm.
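The seeding idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `seeded_kmeans`, the `clamp_seeds` flag, and the toy data are hypothetical, and the sketch assumes NumPy, squared Euclidean distance, and at least one labeled seed per cluster. Initial centroids are the means of the labeled seeds; with `clamp_seeds=True`, the seed labels are additionally held fixed during assignment, which is one way to read "constraints generated from labeled data".

```python
import numpy as np

def seeded_kmeans(X, seed_idx, seed_labels, k, n_iter=100, clamp_seeds=False):
    """KMeans whose initial centroids come from labeled seed points.

    Hypothetical helper for illustration only.
    Assumes every cluster 0..k-1 has at least one seed.
    """
    X = np.asarray(X, dtype=float)
    seed_idx = np.asarray(seed_idx)
    seed_labels = np.asarray(seed_labels)

    # Initialize each centroid as the mean of the seeds labeled with that cluster.
    centroids = np.stack(
        [X[seed_idx[seed_labels == c]].mean(axis=0) for c in range(k)]
    )

    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if clamp_seeds:
            # Constrained variant: seed points keep their given labels.
            new_labels[seed_idx] = seed_labels
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels

        # Update step: recompute each centroid from its current members.
        for c in range(k):
            members = X[labels == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    return labels, centroids

# Toy data: two well-separated groups, one labeled seed in each.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
labels, centroids = seeded_kmeans(X, seed_idx=[0, 3], seed_labels=[0, 1], k=2)
```

Setting `clamp_seeds=False` uses the labeled points only for initialization, so a noisy seed can later move to another cluster; `clamp_seeds=True` trusts the seed labels throughout.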