Text-Item Embedding Space
A Text-Item Embedding Space is an embedding space associated with a distributional text-item embedding function (one that maps text-items to distributional text-item vectors within the space).
- Context:
- It can (typically) be created by a Distributional Text-Item Embedding Modeling System (that implements a distributional text-item embedding modeling algorithm).
- It can (typically) be represented as a Lookup Table (composed of Distributional Text-Item Vectors), as in the sketch after this list.
- It can (typically) be associated with some Large Text Corpus.
- It can (typically) represent a Text Similarity Space.
- It can be an input to a Text Embedding-based Algorithm, such as a text embedding clustering algorithm.
- …
- Example(s):
- a Word Embedding Space, for word embeddings.
- a Character Embedding Space, for character embeddings.
- a Phrase Embedding Space, for phrase embeddings.
- a Sentence Embedding Space, for sentence embeddings (one simple derivation from a word embedding space is sketched after this list).
- a Paragraph Embedding Space, for paragraph embeddings.
- a Document Embedding Space, for document embeddings.
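These spaces are related: a coarser-grained space can be derived from a finer-grained one. As a minimal (and deliberately simplistic) sketch, assuming hypothetical toy word vectors, a sentence embedding can be formed by mean-pooling the word vectors of the sentence:

```python
# A minimal sketch relating a word embedding space to a sentence
# embedding space: a sentence vector is (simplistically) the average
# of its word vectors. All vectors are hypothetical toy values.
import numpy as np

word_space = {
    "the": np.array([0.1, 0.1, 0.1]),
    "cat": np.array([0.9, 0.2, 0.1]),
    "sat": np.array([0.2, 0.8, 0.3]),
}

def sentence_embedding(sentence):
    """Map a sentence into a sentence embedding space by mean-pooling
    the word vectors of its in-vocabulary tokens."""
    vectors = [word_space[w] for w in sentence.lower().split() if w in word_space]
    return np.mean(vectors, axis=0)

print(sentence_embedding("The cat sat"))  # one point in the sentence space
```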
- See: tf-idf Vector, Language Translation Model.
References
2018
- (Wolf, 2018b) ⇒ Thomas Wolf. (2018). “The Current Best of Universal Word Embeddings and Sentence Embeddings.” Blog post.
- QUOTE: Word and sentence embeddings have become an essential part of any Deep-Learning-based natural language processing system. They encode words and sentences 📜 in fixed-length dense vectors 📐 to drastically improve the processing of textual data. A huge trend is the quest for Universal Embeddings: embeddings that are pre-trained on a large corpus and can be plugged into a variety of downstream task models (sentiment analysis, classification, translation…) to automatically improve their performance by incorporating some general word/sentence representations learned on the larger dataset. It’s a form of transfer learning. Transfer learning has been recently shown to drastically increase the performance of NLP models on important tasks such as text classification. …
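A minimal sketch of the transfer-learning pattern Wolf describes, assuming gensim and scikit-learn are available: pre-trained word embeddings are plugged into a small downstream sentiment classifier. The model name "glove-wiki-gigaword-50" is one of gensim's pre-packaged pre-trained embeddings, and the tiny training set is purely illustrative.

```python
# A minimal sketch of universal-embedding transfer learning: pre-trained
# word vectors feed a small downstream sentiment classifier.
# Assumes gensim and scikit-learn; the training data is illustrative only.
import numpy as np
import gensim.downloader as api
from sklearn.linear_model import LogisticRegression

glove = api.load("glove-wiki-gigaword-50")  # pre-trained on a large corpus

def featurize(text):
    """Mean-pool pre-trained word vectors into one fixed-length vector."""
    vecs = [glove[w] for w in text.lower().split() if w in glove]
    return np.mean(vecs, axis=0) if vecs else np.zeros(glove.vector_size)

# Tiny illustrative training set (real tasks use far more data).
texts = ["great movie", "wonderful film", "terrible movie", "awful film"]
labels = [1, 1, 0, 0]

clf = LogisticRegression().fit([featurize(t) for t in texts], labels)
print(clf.predict([featurize("a wonderful movie")]))  # expect [1]
```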