Latent Dirichlet Allocation Model Family

AKA: LDA, Latent Dirichlet Allocation Metamodel.
Context:
- It can be a Bayesian Topic Model designed as a latent semantic analysis model that
  - represents text documents as random mixtures over unobserved groups (latent topics, each characterized by word tuples and a word distribution).
  - generates each word [math]\displaystyle{ w }[/math] in document [math]\displaystyle{ d }[/math] by first sampling a topic [math]\displaystyle{ t }[/math] from the document's topic mixture and then sampling a word from the Topic-Word Distribution of [math]\displaystyle{ t }[/math].
- It can assume that each document's topic distribution has a Dirichlet prior.
- It can be a Constrained Latent Dirichlet Allocation Metamodel if the model includes constraints, such as Background Knowledge (Andrzejewski et al., 2009).
- It can be trained by a Latent Dirichlet Allocation Model Training Algorithm.
- It can be used for a Topic Modeling Task.
- It can be used for a Topic Detection and Tracking Task.
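
The generative story in the Context items above (draw a topic mixture from a Dirichlet prior, then for each word draw a topic and then a word from that topic's word distribution) can be illustrated with a minimal sketch. The vocabulary, the two topic-word distributions, the prior parameter, and the helper names `sample_dirichlet` and `generate_document` are all hypothetical choices made for illustration; they are not taken from any of the cited works, and the sketch covers only generation, not training.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [x / total for x in draws]


def generate_document(topic_word, alpha, vocab, n_words, rng):
    """Generate one document by following the LDA generative story.

    topic_word[t] is the word distribution of topic t (probabilities
    aligned with vocab); alpha is the Dirichlet prior parameter.
    """
    # The document's mixture over topics, drawn from the Dirichlet prior.
    theta = sample_dirichlet(alpha, rng)
    topics, words = [], []
    for _ in range(n_words):
        # First sample a topic t from the document's topic mixture ...
        t = rng.choices(range(len(topic_word)), weights=theta)[0]
        # ... then sample a word w from the topic-word distribution of t.
        w = rng.choices(vocab, weights=topic_word[t])[0]
        topics.append(t)
        words.append(w)
    return theta, topics, words


# Toy setup (assumed values): two topics over a six-word vocabulary.
VOCAB = ["gene", "dna", "cell", "ball", "team", "score"]
TOPIC_WORD = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],  # a "biology"-like topic
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.3],  # a "sports"-like topic
]
ALPHA = [0.5, 0.5]  # symmetric Dirichlet prior over the two topics

rng = random.Random(0)
theta, topics, words = generate_document(TOPIC_WORD, ALPHA, VOCAB, 8, rng)
print("topic mixture:", theta)
print("words:", words)
```

Training runs this story in reverse: given only the words, it infers the topic mixtures and topic-word distributions that plausibly generated them.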
Example(s):
- a Hierarchical LDA Metamodel.
- an LDA Mixture Model.
- …
Counter-Example(s):
See: Probabilistic Graphical Model, Latent Semantic Analysis, Topic Modeling Algorithm.

References

(Wikipedia, 2011-Jun-22) ⇒ http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation
- In statistics, latent Dirichlet allocation (LDA) is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar. For example, if observations are words collected into documents, it posits that each document is a mixture of a small number of topics and that each word's creation is attributable to one of the document's topics. LDA is an example of a topic model and was first presented as a graphical model for topic discovery by David Blei, Andrew Ng, and Michael Jordan in 2002. … In LDA, each document may be viewed as a mixture of various topics. This is similar to probabilistic latent semantic analysis (pLSA), except that in LDA the topic distribution is assumed to have a Dirichlet prior. In practice, this results in more reasonable mixtures of topics in a document. It has been noted, however, that the pLSA model is equivalent to the LDA model under a uniform Dirichlet prior distribution.

(Blei, 2008) ⇒ David M. Blei. (2008). “Modeling Science.” Presentation. April 17, 2008.
(AlSumait et al., 2008) ⇒ Loulwah AlSumait, Daniel Barbará, and Carlotta Domeniconi. (2008). “On-line LDA: Adaptive Topic Models for Mining Text Streams with Applications to Topic Detection and Tracking.” In: Proceedings of the Eighth IEEE International Conference on Data Mining (ICDM 2008) [doi:10.1109/ICDM.2008.140].
(Wallach, 2008) ⇒ Hanna M. Wallach. (2008). “Structured Topic Models for Language.” Ph.D. Thesis, Newnham College, University of Cambridge.

(Steyvers & Griffiths, 2005) ⇒ Mark Steyvers, and Thomas L. Griffiths. (2005). “Probabilistic Topic Models.” In: Thomas K. Landauer (editor), D. McNamara (editor), S. Dennis (editor), and W. Kintsch (editor). “Latent Semantic Analysis: A Road to Meaning.” Lawrence Erlbaum.