2015 LinearTimeSamplersforSupervised
- (Zheng et al., 2015) ⇒ Xun Zheng, Yaoliang Yu, and Eric P. Xing. (2015). “Linear Time Samplers for Supervised Topic Models Using Compositional Proposals.” In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2015). ISBN:978-1-4503-3664-2 doi:10.1145/2783258.2783371
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222015%22+Linear+Time+Samplers+for+Supervised+Topic+Models+Using+Compositional+Proposals
- http://dl.acm.org/citation.cfm?id=2783258.2783371&preflayout=flat#citedby
Quotes
Author Keywords
- General; inference; large margin classification; mcmc; probabilistic algorithms; regression; scale mixtures; topic models
Abstract
Topic models are effective probabilistic tools for processing large collections of unstructured data. With the exponential growth of modern industrial data, and consequentially also with our ambition to explore much bigger models, there is a real pressing need to significantly scale up topic modeling algorithms, which has been taken up in lots of previous works, culminating in the recent fast Markov chain Monte Carlo sampling algorithms in [10, 23] for the unsupervised latent Dirichlet allocation (LDA) formulations.
In this work we extend the recent sampling advances for unsupervised LDA models to supervised tasks. We focus on the Gibbs MedLDA model [27] that is able to simultaneously discover latent structures and make accurate predictions. By combining a set of sampling techniques we are able to reduce the $O(K^3 + DK^2 + DNK)$ complexity in [27] to $O(DK + DN)$ when there are K topics and D documents with average length N. To our best knowledge, this is the first linear time sampling algorithm for supervised topic models. Our algorithm requires minimal modifications to incorporate most loss functions in a variety of supervised tasks, and we observe in our experiments an order of magnitude speedup over the current state-of-the-art implementation, while achieving similar prediction performances.
The open-source C++ implementation of the proposed algorithm is available at https://github.com/xunzheng/light_medlda.
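To make the complexity figures quoted in the abstract concrete, the following minimal sketch evaluates the leading-order terms of both bounds for one set of sizes. The function names and the values of K, D, and N are hypothetical and chosen only for illustration; big-O bounds hide constant factors, so the resulting ratio is not a predicted wall-clock speedup and is not a result from the paper.

```python
def cost_prior(K: int, D: int, N: int) -> int:
    """Leading-order terms of O(K^3 + D*K^2 + D*N*K), constants ignored."""
    return K**3 + D * K**2 + D * N * K


def cost_linear(K: int, D: int, N: int) -> int:
    """Leading-order terms of O(D*K + D*N), constants ignored."""
    return D * K + D * N


# Hypothetical sizes (assumptions, not taken from the paper):
# K topics, D documents, average document length N.
K, D, N = 1_000, 100_000, 100

prior = cost_prior(K, D, N)
linear = cost_linear(K, D, N)

print(f"prior bound terms:  {prior:,}")
print(f"linear bound terms: {linear:,}")
print(f"ratio of terms:     {prior / linear:.1f}")
```

With these sizes the D*K^2 term dominates the earlier bound, which suggests why removing the quadratic and cubic dependence on K matters most when the number of topics is large.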
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2015 LinearTimeSamplersforSupervised | Eric P. Xing, Xun Zheng, Yaoliang Yu | | | Linear Time Samplers for Supervised Topic Models Using Compositional Proposals | | | | 10.1145/2783258.2783371 | | 2015 |