Distributional-based Word/Token Embedding Space
A Distributional-based Word/Token Embedding Space is a text-item embedding space for word vectors that is associated with a distributional word vectorizing function (which maps words/tokens to distributional word vectors).
- Context:
- It can be created by a Distributional Word Embedding Modeling System (that implements a distributional word embedding modeling algorithm).
- It can range from being a Closed Distributional Word Vector Space Model (that maps only the words observed in the training data) to being an Open Distributional Word Vector Space Model (that can also map out-of-vocabulary words, e.g. by composing subword vectors), as illustrated in the sketches after this outline.
- …
- Example(s):
- a word2vec Vector Space Model, created by word2vec.
- a GloVe Vector Space Model, created by GloVe.
- a FastText Vector Space Model, created by FastText.
- an ELMo Vector Space Model, created by ELMo.
- a Word-Word PMI Matrix (see the count-based sketch after this outline).
- …
- Counter-Example(s):
- See: Distributional Word Vectorizing Model, Lexical Co-Occurrence Matrix, Distributional Word Vector, Vector Space Model, Word Vector Space Mapping Function.
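The closed/open distinction above can be made concrete with a small sketch. The following is a minimal illustration (not taken from any of the referenced systems) that assumes the gensim library is installed; the toy corpus and all parameter values are placeholders chosen for brevity. Word2Vec yields a closed space (unseen words have no vector), while FastText composes subword vectors and can therefore map out-of-vocabulary words.

```python
# Minimal sketch (assumes gensim >= 4.x): a distributional word vectorizing
# function maps each word/token to a dense vector learned from co-occurrence
# patterns in a corpus. The corpus and parameters below are illustrative only.
from gensim.models import Word2Vec, FastText

corpus = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["a", "cat", "sleeps", "on", "the", "mat"],
]

w2v = Word2Vec(sentences=corpus, vector_size=25, window=2, min_count=1, epochs=50)
ft = FastText(sentences=corpus, vector_size=25, window=2, min_count=1, epochs=50)

print(w2v.wv["king"].shape)      # (25,) -- an in-vocabulary word vector
print(ft.wv["kingdoms"].shape)   # (25,) -- OOV word composed from subword n-grams

try:
    w2v.wv["kingdoms"]           # closed vocabulary: raises KeyError
except KeyError:
    print("Word2Vec space is closed: 'kingdoms' is out of vocabulary")
```

A second minimal sketch covers the count-based case in the Example(s) list: a word-word (P)PMI matrix built from toy co-occurrence counts, whose rows serve as sparse distributional word vectors. The vocabulary and counts below are invented for illustration.

```python
# Minimal PPMI sketch (plain numpy, toy counts; not derived from any real corpus).
import numpy as np

vocab = ["king", "queen", "rules", "cat", "mat"]
# co[i, j] = number of times vocab[j] occurs in the context window of vocab[i]
co = np.array([[0, 2, 5, 0, 0],
               [2, 0, 4, 0, 0],
               [5, 4, 0, 0, 0],
               [0, 0, 0, 0, 3],
               [0, 0, 0, 3, 0]], dtype=float)

total = co.sum()
p_ij = co / total                              # joint probability estimates
p_i = co.sum(axis=1, keepdims=True) / total    # word marginals
p_j = co.sum(axis=0, keepdims=True) / total    # context marginals
with np.errstate(divide="ignore"):
    pmi = np.log(p_ij / (p_i * p_j))
ppmi = np.maximum(pmi, 0.0)                    # positive PMI; each row is a word vector
print(ppmi[vocab.index("king")])
```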
References
2018
- (Wolf, 2018b) ⇒ Thomas Wolf. (2018). “The Current Best of Universal Word Embeddings and Sentence Embeddings.” Blog post.
- QUOTE: Word and sentence embeddings have become an essential part of any Deep-Learning-based natural language processing systems. They encode words and sentences 📜 in fixed-length dense vectors 📐 to drastically improve the processing of textual data. A huge trend is the quest for Universal Embeddings: embeddings that are pre-trained on a large corpus and can be plugged in a variety of downstream task models (sentiment analysis, classification, translation…) to automatically improve their performance by incorporating some general word/sentence representations learned on the larger dataset. It’s a form of transfer learning. Transfer learning has been recently shown to drastically increase the performance of NLP models on important tasks such as text classification. …
… A wealth of possible ways to embed words have been proposed over the last five years. The most commonly used models are word2vec and GloVe which are both unsupervised approaches based on the distributional hypothesis (words that occur in the same contexts tend to have similar meanings). While several works augment these unsupervised approaches by incorporating the supervision of semantic or syntactic knowledge, purely unsupervised approaches have seen interesting developments in 2017–2018, the most notable being FastText (an extension of word2vec) and ELMo (state-of-the-art contextual word vectors).
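The “universal embedding” idea in the quote above can be illustrated with a short, hedged sketch: pre-trained GloVe vectors (loaded via gensim's downloader) are reused unchanged as features for a downstream sentiment classifier built with scikit-learn. The tiny labelled dataset is a toy placeholder, not a benchmark, and the chosen models are just one possible setup.

```python
# Minimal transfer-learning sketch (assumes gensim and scikit-learn are installed;
# the labelled examples below are toy placeholders). Pre-trained GloVe vectors are
# averaged into sentence features for a downstream sentiment classifier.
import numpy as np
import gensim.downloader as api
from sklearn.linear_model import LogisticRegression

glove = api.load("glove-wiki-gigaword-50")   # embeddings pre-trained on a large corpus

def embed(sentence):
    """Average the pre-trained vectors of the in-vocabulary tokens."""
    vectors = [glove[tok] for tok in sentence.lower().split() if tok in glove]
    return np.mean(vectors, axis=0) if vectors else np.zeros(glove.vector_size)

train_texts = ["a wonderful delightful film", "boring and painful to watch",
               "great acting and a great story", "a dull tedious mess"]
train_labels = [1, 0, 1, 0]                  # 1 = positive, 0 = negative (toy labels)

clf = LogisticRegression().fit([embed(t) for t in train_texts], train_labels)
print(clf.predict([embed("a delightful story"), embed("tedious and dull")]))
```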
2017a
- (Ruder, 2017b) ⇒ Sebastian Ruder. (2017). “Word Embeddings in 2017: Trends and Future Directions.” Blog post.
- QUOTE:
- Subword-level embeddings
- OOV handling
- Evaluation
- Multi-sense embeddings
- Beyond words as points
- Phrases and multi-word expressions
- Bias
- Temporal dimension
- Lack of theoretical understanding
- Task and domain-specific embeddings
- Transfer learning
- Embeddings for multiple languages
- Embeddings based on other contexts
2017b
- (Yang, Lu & Zheng, 2017) ⇒ Wei Yang, Wei Lu, and Vincent Zheng. (2017). “A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings.” In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 2898-2904.
- ABSTRACT: Learning word embeddings has received a significant amount of attention recently. Often, word embeddings are learned in an unsupervised manner from a large collection of text. The genre of the text typically plays an important role in the effectiveness of the resulting embeddings. How to effectively train word embedding models using data from different domains remains a problem that is underexplored. In this paper, we present a simple yet effective method for learning word embeddings based on text from different domains. We demonstrate the effectiveness of our approach through extensive experiments on various down-stream NLP tasks.