Continuous Dense Distributional Word Model Training Algorithm
A Continuous Dense Distributional Word Model Training Algorithm is a text-item embedding algorithm and a dense distributional model training algorithm that can be implemented by a continuous dense distributional word model training system (to solve a continuous dense distributional word model training task).
- AKA: Word Embedding Algorithm.
- Context:
- …
- Example(s):
- Counter-Example(s):
- See: Embedding Algorithm, SGNS Algorithm, Word Embeddings, Distributional Word Model Training Algorithm, Continuous Dense Word Model, Text Item, Dense Word Model Training Algorithm.
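As an illustration of what such an algorithm can look like, the following is a minimal sketch of one possible instance, a toy skip-gram with negative sampling (SGNS) trainer. It is not taken from any of the referenced papers: the function name `train_sgns`, its parameters, and the uniform sampling of negatives are all assumptions made for brevity (word2vec itself draws negatives from a smoothed unigram distribution, and a sampled negative here may occasionally coincide with the observed context).

```python
import math
import random


def train_sgns(sentences, dim=8, window=2, negatives=3, epochs=20, lr=0.05, seed=0):
    """Toy skip-gram with negative sampling (hypothetical helper).

    Takes tokenized sentences and returns {word: dense vector as list of floats}.
    """
    rng = random.Random(seed)
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    # Two dense tables: word ("input") vectors and context ("output") vectors.
    word_vecs = [[rng.uniform(-0.5, 0.5) / dim for _ in range(dim)] for _ in vocab]
    ctx_vecs = [[0.0] * dim for _ in vocab]

    def sigmoid(x):
        # Clamp the score so math.exp cannot overflow.
        x = max(-30.0, min(30.0, x))
        return 1.0 / (1.0 + math.exp(-x))

    for _ in range(epochs):
        for sent in sentences:
            ids = [index[w] for w in sent]
            for pos, w in enumerate(ids):
                lo, hi = max(0, pos - window), min(len(ids), pos + window + 1)
                for c_pos in range(lo, hi):
                    if c_pos == pos:
                        continue
                    # One observed (word, context) pair with label 1, plus
                    # uniformly drawn negatives with label 0 (a simplification).
                    samples = [(ids[c_pos], 1.0)]
                    samples += [(rng.randrange(len(vocab)), 0.0) for _ in range(negatives)]
                    grad_w = [0.0] * dim
                    for c, label in samples:
                        score = sum(a * b for a, b in zip(word_vecs[w], ctx_vecs[c]))
                        g = lr * (label - sigmoid(score))
                        for d in range(dim):
                            # Accumulate the word-vector update using the
                            # context vector as it was before this step.
                            grad_w[d] += g * ctx_vecs[c][d]
                            ctx_vecs[c][d] += g * word_vecs[w][d]
                    for d in range(dim):
                        word_vecs[w][d] += grad_w[d]
    return {w: word_vecs[index[w]] for w in vocab}
```

Calling `train_sgns([["the", "cat", "sat"], ["the", "dog", "sat"]])` returns one dense, real-valued vector per vocabulary word. On a corpus this small the vectors carry little meaning; the sketch is only meant to show the shape of the training loop.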
References
2014
- (Levy & Goldberg, 2014) ⇒ Omer Levy, and Yoav Goldberg. (2014). “Neural Word Embedding As Implicit Matrix Factorization.” In: Advances in Neural Information Processing Systems.
- QUOTE: Recently, there has been a surge of work proposing to represent words as dense vectors, derived using various training methods inspired from neural-network language modeling [3, 9, 23, 21].
2013
- (Mikolov et al., 2013c) ⇒ Tomáš Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. (2013). “Distributed Representations of Words and Phrases and their Compositionality.” In: Advances in Neural Information Processing Systems, 26.
- (Chelba et al., 2013) ⇒ Ciprian Chelba, Tomáš Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Philipp Koehn, and Tony Robinson. (2013). “One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling.” Technical Report, Google Research.
- (Mikolov et al., 2013a) ⇒ Tomáš Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. (2013). “Efficient Estimation of Word Representations in Vector Space.” CoRR, abs/1301.3781, 2013.
- (Mikolov et al., 2013b) ⇒ Tomáš Mikolov, Wen-tau Yih, and Geoffrey Zweig. (2013). “Linguistic Regularities in Continuous Space Word Representations.” In: HLT-NAACL.
2003
- (Bengio et al., 2003a) ⇒ Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Jauvin. (2003). “A Neural Probabilistic Language Model.” In: The Journal of Machine Learning Research, 3.
- QUOTE: A goal of statistical language modeling is to learn the joint probability function of sequences of words in a language. This is intrinsically difficult because of the curse of dimensionality: a word sequence on which the model will be tested is likely to be different from all the word sequences seen during training. Traditional but very successful approaches based on n-grams obtain generalization by concatenating very short overlapping sequences seen in the training set. We propose to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.
1986
- (Hinton, 1986) ⇒ Geoffrey E. Hinton. (1986). “Learning Distributed Representations of Concepts.” In: Proceedings of the Eighth Annual Conference of the Cognitive Science Society.
- QUOTE: Concepts can be represented by distributed patterns of activity in networks of neuron-like units. One advantage of this kind of representation is that it leads to automatic generalization.