word2vec Model Instance
A word2vec Model Instance is a continuous dense distributional word vector space model produced by a word2vec system.
- AKA: word2vec Word Vector Space Model, word2vec WVSM.
- Context:
- It can (typically) include a word2vec Word Vectorizing Function (to create word2vec vectors).
- Example(s):
- the model created by v1 code on the 20 Newsgroups Corpus using settings: -cbow 1 -negative 25 -hs 0 -sample 1e-4 -threads 40 -binary 1 -iter 15 -window 8 -size 200.
- …
- Counter-Example(s):
- See: word2vec Distance Function, word2vec Analogy Function.
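As a rough illustration of the Context item above, a model instance can be pictured as a vocabulary paired with a table of dense vectors, where the vectorizing function is a lookup into that table. The Python sketch below is only an illustration: the class name `Word2VecModelInstance`, the method `vectorize`, and the three-dimensional toy vectors are hypothetical and are not part of the word2vec tool (the example settings above would give 200 dimensions).

```python
from typing import Dict, List, Optional


class Word2VecModelInstance:
    """Hypothetical container: a vocabulary mapped to dense word vectors."""

    def __init__(self, vectors: Dict[str, List[float]]) -> None:
        if not vectors:
            raise ValueError("a model instance needs at least one word vector")
        dims = {len(vec) for vec in vectors.values()}
        if len(dims) != 1:
            raise ValueError("all word vectors must share one dimensionality")
        self.size = dims.pop()          # vector dimensionality (cf. -size)
        self._vectors = dict(vectors)   # word -> dense vector

    def __len__(self) -> int:
        """Number of words in the vocabulary."""
        return len(self._vectors)

    def vectorize(self, word: str) -> Optional[List[float]]:
        """Return a copy of the word's vector, or None if it is out of vocabulary."""
        vec = self._vectors.get(word)
        return list(vec) if vec is not None else None


# Toy values chosen for illustration only; they were not learned from any corpus.
toy_model = Word2VecModelInstance({
    "king": [0.5, 0.1, -0.3],
    "queen": [0.4, 0.2, -0.2],
})

print(toy_model.vectorize("king"))
print(toy_model.vectorize("unseen-word"))
```

Returning `None` for an out-of-vocabulary word is one possible design choice; a caller could equally raise an error or substitute a default vector.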
References
2017
- (Shu & Nakayama, 2017) ⇒ Raphael Shu, and Hideki Nakayama. (2017). “Compressing Word Embeddings via Deep Compositional Code Learning.” In: Proceedings of 5th International Conference on Learning Representations (ICLR-2017).
2014
- (Rei & Briscoe, 2014) ⇒ Marek Rei, and Ted Briscoe. (2014). “Looking for Hyponyms in Vector Space.” In: Proceedings of CoNLL-2014.
- QUOTE: The window-based, dependency-based and word2vec vector sets were all trained on 112M words from the British National Corpus, with preprocessing steps for lower-casing and lemmatising. Any numbers were grouped and substituted by more generic tokens.
2013
- https://code.google.com/p/word2vec/
- QUOTE: The word2vec tool takes a text corpus as input and produces the word vectors as output. It first constructs a vocabulary from the training text data and then learns vector representation of words. The resulting word vector file can be used as features in many natural language processing and machine learning applications.
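To make the "word vector file" in the quote above concrete, the following Python sketch reads vectors from a file written in binary mode (`-binary 1`). It assumes the commonly described layout: a text header line with the vocabulary size and the vector size, then for each word its bytes, a single space, the vector as 32-bit floats, and a trailing newline. Little-endian byte order is also an assumption, and the function name `read_word2vec_binary` and the two-word demo data are hypothetical.

```python
import io
import struct
from typing import BinaryIO, Dict, Tuple


def read_word2vec_binary(stream: BinaryIO) -> Dict[str, Tuple[float, ...]]:
    """Read word vectors from a stream in the assumed word2vec binary layout."""
    header = stream.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    vectors: Dict[str, Tuple[float, ...]] = {}
    for _ in range(vocab_size):
        word_bytes = bytearray()
        while True:
            ch = stream.read(1)
            if ch == b"":
                raise ValueError("unexpected end of data while reading a word")
            if ch == b" ":
                break  # a space terminates the word
            if ch != b"\n":  # skip the newline that ends the previous entry
                word_bytes.extend(ch)
        raw = stream.read(4 * dim)  # dim float32 values, 4 bytes each
        if len(raw) != 4 * dim:
            raise ValueError("unexpected end of data while reading a vector")
        word = word_bytes.decode("utf-8", errors="replace")
        vectors[word] = struct.unpack("<%df" % dim, raw)
    return vectors


def build_demo_bytes() -> bytes:
    """Build a tiny in-memory file in the same layout (made-up values)."""
    entries = {"the": (0.5, -1.0), "of": (0.25, 2.0)}
    out = b"2 2\n"
    for word, vec in entries.items():
        out += word.encode("utf-8") + b" " + struct.pack("<2f", *vec) + b"\n"
    return out


demo_vectors = read_word2vec_binary(io.BytesIO(build_demo_bytes()))
print(demo_vectors["the"])
```

For a real file, the stream would come from `open(path, "rb")`; the in-memory demo only checks that the reader and the assumed layout agree with each other.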