2008 FastCollapsedGibbsSamplingforLa
- (Porteous et al., 2008) ⇒ Ian Porteous, David Newman, Alexander Ihler, Arthur Asuncion, Padhraic Smyth, and Max Welling. (2008). “Fast Collapsed Gibbs Sampling for Latent Dirichlet Allocation.” In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2008). doi:10.1145/1401890.1401960
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%22Fast+collapsed+gibbs+sampling+for+latent+dirichlet+allocation%22+2008
- http://portal.acm.org/citation.cfm?doid=1401890.1401960&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
In this paper we introduce a novel collapsed Gibbs sampling method for the widely used latent Dirichlet allocation (LDA) model. Our new method results in significant speedups on real-world text corpora. Conventional Gibbs sampling schemes for LDA require O(K) operations per sample, where K is the number of topics in the model. Our proposed method draws equivalent samples but requires on average significantly less than K operations per sample. On real-world corpora FastLDA can be as much as 8 times faster than the standard collapsed Gibbs sampler for LDA. No approximations are necessary, and we show that our fast sampling scheme produces exactly the same results as the standard (but slower) sampling scheme. Experiments on four real-world data sets demonstrate speedups for a wide range of collection sizes. For the PubMed collection of over 8 million documents with a required computation time of 6 CPU months for LDA, our speedup of 5.7 can save 5 CPU months of computation.
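To make the O(K) cost mentioned in the abstract concrete, the following is a minimal sketch of the conventional collapsed Gibbs update that the paper takes as its baseline. It is not the FastLDA method. It assumes symmetric Dirichlet priors `alpha` and `beta`, and all function and variable names (`sample_topic`, `gibbs_sweep`, `n_dk`, `n_wk`, `n_k`) are hypothetical choices made for illustration.

```python
import random


def sample_topic(doc_topic, word_topic, topic_total, alpha, beta, vocab_size, rng):
    """Draw a topic for one token with the standard O(K) collapsed Gibbs update.

    doc_topic[k]   : tokens in the current document assigned to topic k
    word_topic[k]  : occurrences of the current word type assigned to topic k
    topic_total[k] : all tokens in the corpus assigned to topic k
    All counts must already exclude the token being resampled.
    """
    # Every one of the K topics is visited: this loop is the O(K) cost.
    weights = [
        (doc_topic[k] + alpha) * (word_topic[k] + beta)
        / (topic_total[k] + vocab_size * beta)
        for k in range(len(doc_topic))
    ]
    # Inverse-CDF draw from the unnormalized weights.
    u = rng.random() * sum(weights)
    cumulative = 0.0
    for k, w in enumerate(weights):
        cumulative += w
        if u < cumulative:
            return k
    return len(weights) - 1  # guard against floating-point round-off


def gibbs_sweep(docs, z, n_dk, n_wk, n_k, alpha, beta, rng):
    """Resample the topic of every token once, updating the counts in place.

    docs[d][i] is a word id, z[d][i] its current topic,
    n_dk[d][k], n_wk[w][k] and n_k[k] are the usual count tables.
    """
    vocab_size = len(n_wk)
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            old = z[d][i]
            # Remove the token from the counts before sampling.
            n_dk[d][old] -= 1
            n_wk[w][old] -= 1
            n_k[old] -= 1
            new = sample_topic(n_dk[d], n_wk[w], n_k, alpha, beta, vocab_size, rng)
            # Add it back under the newly drawn topic.
            z[d][i] = new
            n_dk[d][new] += 1
            n_wk[w][new] += 1
            n_k[new] += 1
```

Each call to `sample_topic` computes all K weights before drawing. According to the abstract, FastLDA draws equivalent samples while needing on average significantly fewer than K operations per sample.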
References
| | Author | title | doi | year |
|---|---|---|---|---|
| 2008 FastCollapsedGibbsSamplingforLa | Padhraic Smyth, Arthur Asuncion, Ian Porteous, David Newman, Alexander Ihler, Max Welling | Fast Collapsed Gibbs Sampling for Latent Dirichlet Allocation | 10.1145/1401890.1401960 | 2008 |