Word Embedding Algorithm
A Word Embedding Algorithm is a text-item embedding algorithm that can generate vectorized word representations.
- AKA: Word Representation Learning Algorithm, Word Modelling Algorithm, Word Vector Generation Algorithm.
- Context:
- It can be implemented by a Word Embedding System to solve a Word Embedding Task.
- It can range from being an In-Vocabulary (IV) Embedding Algorithm to being an Out-Of-Vocabulary (OOV) Embedding Algorithm.
- It can range from being a Dense Continuous Word Modeling Algorithm to being a Distributional Word Embedding Modeling Algorithm.
- ...
- Example(s):
- TensorFlow Word Embeddings coding tutorial(s),
- MIMICK Embedding Algorithm (Pinter et al., 2017),
- Polyglot Word Embedding Algorithm (Al-Rfou et al., 2013),
- Word2Vec Algorithm.
- …
- Counter-Example(s):
- See: Distributional Co-Occurrence Word Vector, Term Vector Space, Sentiment Analysis, Natural Language Processing, Language Model, Feature Learning, Vector (Mathematics), Real Numbers, Embedding, Vector Space, Neural Net Language Model, Dimensionality Reduction, Co-Occurrence Matrix, Syntactic Parsing.
References
2021
- (Wikipedia, 2021) ⇒ https://en.wikipedia.org/wiki/Word_embedding Retrieved:2021-4-18.
- In natural language processing (NLP), Word embedding is a term used for the representation of words for text analysis, typically in the form of a real-valued vector that encodes the meaning of the word such that the words that are closer in the vector space are expected to be similar in meaning. Word embeddings can be obtained using a set of language modeling and feature learning techniques where words or phrases from the vocabulary are mapped to vectors of real numbers. Conceptually it involves a mathematical embedding from a space with many dimensions per word to a continuous vector space with a much lower dimension. Methods to generate this mapping include neural networks, dimensionality reduction on the word co-occurrence matrix, probabilistic models, explainable knowledge base method, and explicit representation in terms of the context in which words appear. Word and phrase embeddings, when used as the underlying input representation, have been shown to boost the performance in NLP tasks such as syntactic parsing and sentiment analysis.
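- As a rough illustration of the "explicit representation in terms of the context in which words appear" mentioned above, the following minimal sketch builds sparse co-occurrence count vectors and compares them with cosine similarity. The toy corpus, the window size of 2, and the helper names `cooccurrence_vectors` and `cosine` are assumptions made for this example; they do not come from the cited article, and practical systems usually go on to reduce or learn dense vectors from such counts.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of context-word counts within +/- `window` tokens."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus, already tokenized.
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases mice".split(),
    "the dog chases cars".split(),
]

vecs = cooccurrence_vectors(corpus)
print("cat ~ dog :", cosine(vecs["cat"], vecs["dog"]))
print("cat ~ milk:", cosine(vecs["cat"], vecs["milk"]))
```

- On this toy corpus, "cat" and "dog" share most of their contexts and so score higher than "cat" and "milk", which is the distributional-similarity intuition in miniature.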
2017
- (Pinter et al., 2017) ⇒ Yuval Pinter, Robert Guthrie, and Jacob Eisenstein. (2017). “Mimicking Word Embeddings Using Subword RNNs.” In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017).
- QUOTE: One of the key advantages of word embeddings for natural language processing is that they enable generalization to words that are unseen in labeled training data, by embedding lexical features from large unlabeled datasets into a relatively low-dimensional Euclidean space. These low-dimensional embeddings are typically trained to capture distributional similarity, so that information can be shared among words that tend to appear in similar contexts.
2013
- (Al-Rfou et al., 2013) ⇒ Rami Al-Rfou, Bryan Perozzi, and Steven Skiena. (2013). “Polyglot: Distributed Word Representations for Multilingual NLP.” In: Proceedings of the Seventeenth Conference on Computational Natural Language Learning (CoNLL 2013).
- QUOTE: Distributed word representations (word embeddings) map the index of a word in a dictionary to a feature vector in high-dimension space. Every dimension contributes to multiple concepts, and every concept is expressed by a combination of subset of dimensions. Such mapping is learned by back-propagating the error of a task through the model to update random initialized embeddings. The task is usually chosen such that examples can be automatically generated from unlabeled data (i.e so it is unsupervised). In case of language modeling, the task is to predict the last word of a phrase that consists of n words.
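- The training loop described in the quote can be sketched as follows: a randomly initialized embedding table is indexed by word id, and the error of a last-word prediction task is back-propagated into the rows that were looked up. This is a minimal illustration, not the Polyglot model itself: averaging the context embeddings, the softmax output layer, the hyperparameters, and the function name `train_embeddings` are all assumptions chosen to keep the example short.

```python
import math
import random


def train_embeddings(phrases, dim=8, epochs=200, lr=0.1, seed=0):
    """Learn word vectors by predicting the last word of each phrase."""
    rng = random.Random(seed)
    vocab = sorted({w for p in phrases for w in p})
    index = {w: i for i, w in enumerate(vocab)}  # word -> dictionary index
    V = len(vocab)
    # Randomly initialized embedding table (one row per word index) and output weights.
    E = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(V)]
    W = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(V)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for phrase in phrases:
            ctx = [index[w] for w in phrase[:-1]]
            target = index[phrase[-1]]
            # Forward pass: average the context embeddings, then softmax over the vocabulary.
            h = [sum(E[i][d] for i in ctx) / len(ctx) for d in range(dim)]
            scores = [sum(W[k][d] * h[d] for d in range(dim)) for k in range(V)]
            m = max(scores)
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            probs = [e / z for e in exps]
            total -= math.log(probs[target])
            # Backward pass: d(cross-entropy)/d(scores) = probs - one_hot(target).
            dscores = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
            dh = [sum(dscores[k] * W[k][d] for k in range(V)) for d in range(dim)]
            for k in range(V):
                for d in range(dim):
                    W[k][d] -= lr * dscores[k] * h[d]
            # The error reaches the embedding rows that were looked up.
            for i in ctx:
                for d in range(dim):
                    E[i][d] -= lr * dh[d] / len(ctx)
        losses.append(total / len(phrases))
    return index, E, losses


# Hypothetical unlabeled phrases; each phrase's last word is the prediction target.
phrases = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "a cat chases mice".split(),
    "a dog chases cars".split(),
]
index, E, losses = train_embeddings(phrases)
print(f"mean loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
print("vector for 'cat':", E[index["cat"]])
```

- Because the training examples are cut directly from raw text, no manual labels are needed, which is the sense in which the quote calls the task unsupervised.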