2015 DiversifyingRestrictedBoltzmann
- (Xie et al., 2015) ⇒ Pengtao Xie, Yuntian Deng, and Eric Xing. (2015). “Diversifying Restricted Boltzmann Machine for Document Modeling.” In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2015). ISBN:978-1-4503-3664-2 doi:10.1145/2783258.2783264
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222015%22+Diversifying+Restricted+Boltzmann+Machine+for+Document+Modeling
- http://dl.acm.org/citation.cfm?id=2783258.2783264&preflayout=flat#citedby
Quotes
Author Keywords
- Data mining; diversified restricted boltzmann machine; diversity; document modeling; power-law distribution; topic modeling
Abstract
Restricted Boltzmann Machine (RBM) has shown great effectiveness in document modeling. It utilizes hidden units to discover the latent topics and can learn compact semantic representations for documents which greatly facilitate document retrieval, clustering and classification. The popularity (or frequency) of topics in text corpora usually follows a power-law distribution where a few dominant topics occur very frequently while most topics (in the long-tail region) have low probabilities. Due to this imbalance, RBM tends to learn multiple redundant hidden units to best represent dominant topics and ignore those in the long-tail region, which renders the learned representations redundant and non-informative. To solve this problem, we propose Diversified RBM (DRBM) which diversifies the hidden units, to make them cover not only the dominant topics, but also those in the long-tail region. We define a diversity metric and use it as a regularizer to encourage the hidden units to be diverse. Since the diversity metric is hard to optimize directly, we instead optimize its lower bound and prove that maximizing the lower bound with projected gradient ascent can increase this diversity metric. Experiments on document retrieval and clustering demonstrate that with diversification, the document modeling power of DRBM can be greatly improved.
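The abstract does not spell out the diversity metric, so the following is only a minimal sketch of how a diversity regularizer over hidden-unit weight vectors could look. It assumes an angle-based score (mean pairwise angle minus the variance of those angles); the function names, the `strength` parameter, and the toy weight vectors are all hypothetical, and the sketch does not cover the lower bound or the projected gradient ascent mentioned above.

```python
import math
from itertools import combinations


def pairwise_angles(weights):
    """Non-obtuse angle (radians) between every pair of weight vectors.

    Assumes at least two vectors, all nonzero and of equal length.
    """
    angles = []
    for u, v in combinations(weights, 2):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        # abs() folds obtuse angles into [0, pi/2];
        # min() guards against rounding slightly above 1.
        cosine = min(1.0, abs(dot) / (norm_u * norm_v))
        angles.append(math.acos(cosine))
    return angles


def diversity_score(weights):
    """Assumed score: mean pairwise angle minus variance of the angles."""
    angles = pairwise_angles(weights)
    mean = sum(angles) / len(angles)
    variance = sum((t - mean) ** 2 for t in angles) / len(angles)
    return mean - variance


def regularized_objective(log_likelihood, weights, strength=0.1):
    """Data-fit term plus a diversity bonus.

    `strength` is a hypothetical trade-off parameter, not a value from the paper.
    """
    return log_likelihood + strength * diversity_score(weights)


# Toy weight vectors: three near-duplicate units vs. three orthogonal units.
redundant = [[1.0, 0.0], [0.99, 0.01], [0.98, 0.02]]
diverse = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

Under this assumed score, near-duplicate weight vectors (the redundant hidden units the abstract describes) score close to zero, while mutually orthogonal vectors reach the maximum mean angle of π/2, so adding the score to the training objective rewards hidden units that point in different directions.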
References
 | Author | title | doi | year |
---|---|---|---|---|
2015 DiversifyingRestrictedBoltzmann | Eric P. Xing, Pengtao Xie, Yuntian Deng | Diversifying Restricted Boltzmann Machine for Document Modeling | 10.1145/2783258.2783264 | 2015 |