Word Embedding Task

AKA: Word Representation Learning Task, Word Modelling Task, Word Vector Generation Task.
Context:
- Task Input: Text Document.
- Task Output: Vectorized Word Representation.
- Task Requirement(s): Language Model, Training Corpus.
- It can be solved by a Word Embedding System that implements a Word Embedding Algorithm.
- It can range from being an In-Vocabulary (IV) Embedding Task to being an Out-Of-Vocabulary (OOV) Embedding Task.
- It can range from being a Dense Continuous Word Modeling Task to being a Distributional Word Embedding Modeling Task.
Example(s):
- MIMICK Embedding Task (Pinter et al., 2017).
- Polyglot Word Embedding Task (Al-Rfou et al., 2013).
- Word2Vec Task.
- …
Counter-Example(s):
See: Distributional Co-Occurrence Word Vector, Term Vector Space, Sentiment Analysis, Natural Language Processing, Language Model, Feature Learning, Vector (Mathematics), Real Numbers, Embedding, Vector Space, Neural Net Language Model, Dimensionality Reduction, Co-Occurrence Matrix, Syntactic Parsing.

References

(Wikipedia, 2021) ⇒ https://en.wikipedia.org/wiki/Word_embedding Retrieved:2021-4-18.
- In natural language processing (NLP), word embedding is a term used for the representation of words for text analysis, typically in the form of a real-valued vector that encodes the meaning of the word such that the words that are closer in the vector space are expected to be similar in meaning. Word embeddings can be obtained using a set of language modeling and feature learning techniques where words or phrases from the vocabulary are mapped to vectors of real numbers. Conceptually it involves a mathematical embedding from a space with many dimensions per word to a continuous vector space with a much lower dimension. Methods to generate this mapping include neural networks, dimensionality reduction on the word co-occurrence matrix, probabilistic models, explainable knowledge base method, and explicit representation in terms of the context in which words appear. Word and phrase embeddings, when used as the underlying input representation, have been shown to boost the performance in NLP tasks such as syntactic parsing and sentiment analysis.
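- As a rough illustration of one method named above (dimensionality reduction on the word co-occurrence matrix), the following minimal sketch builds a co-occurrence matrix from a toy corpus, reduces it with a truncated SVD, and compares two word vectors by cosine similarity. The corpus, the window size, the target dimension, and the helper names (build_cooccurrence, embed, cosine) are all assumptions made for this example; it is not the method of any system cited on this page.

```python
import numpy as np

# Toy corpus (hypothetical, for illustration only).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]


def build_cooccurrence(sentences, window=2):
    """Count how often each word pair occurs within `window` tokens."""
    tokenized = [s.split() for s in sentences]
    vocab = sorted({w for sent in tokenized for w in sent})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)), dtype=float)
    for sent in tokenized:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[index[w], index[sent[j]]] += 1.0
    return vocab, index, counts


def embed(counts, dim=3):
    """Map each word to a `dim`-dimensional vector via truncated SVD."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    # Keep the top `dim` singular directions, scaled by singular values.
    return u[:, :dim] * s[:dim]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


vocab, index, counts = build_cooccurrence(corpus)
vectors = embed(counts, dim=3)

# Words used in similar contexts are expected to score higher.
print("cat vs dog:", cosine(vectors[index["cat"]], vectors[index["dog"]]))
print("cat vs on: ", cosine(vectors[index["cat"]], vectors[index["on"]]))
```

- Here the Task Input is the toy corpus and the Task Output is the vectors array, with one row per in-vocabulary word. A word absent from the corpus has no row, which is the Out-Of-Vocabulary (OOV) case that this sketch does not handle.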