Bag-of-Words Vector Space
A Bag-of-Words Vector Space is a text-item vector space that is restricted to bag-of-words vectors.
- AKA: BoW Vector Space.
- Context:
- It can be (typically) defined by a Bag-of-Words Model (based on some corpus), as illustrated in the sketch after this list.
- Example(s):
- …
- Counter-Example(s):
- See: Document Vector, Metric Space, TF-IDF.
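As a concrete illustration of the context above, the sketch below constructs bag-of-words vectors for a tiny corpus; the example documents, the whitespace tokenizer, and the bow_vector helper are illustrative assumptions rather than any standard API. Each vocabulary word contributes one dimension of the space, and each document maps to its vector of word counts.

```python
from collections import Counter

# Toy corpus (illustrative assumption): two short documents.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
]

# The vocabulary fixes the dimensions of the vector space.
vocabulary = sorted({word for doc in corpus for word in doc.split()})

def bow_vector(text, vocabulary):
    """Map a text to its bag-of-words vector: one word count per vocabulary entry."""
    counts = Counter(text.split())
    return [counts[word] for word in vocabulary]

vectors = [bow_vector(doc, vocabulary) for doc in corpus]
print(vocabulary)   # ['cat', 'dog', 'log', 'mat', 'on', 'sat', 'the']
print(vectors[0])   # [1, 0, 0, 1, 1, 1, 2]  -- 'the' occurs twice; word order is discarded
print(vectors[1])   # [0, 1, 1, 0, 1, 1, 2]
```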
References
2021
- (Wikipedia, 2021) ⇒ https://en.wikipedia.org/wiki/Bag-of-words_model Retrieved:2021-4-18.
- The bag-of-words model is a simplifying representation used in natural language processing and information retrieval (IR). In this model, a text (such as a sentence or a document) is represented as the bag (multiset) of its words, disregarding grammar and even word order but keeping multiplicity. The bag-of-words model has also been used for computer vision.[1]
The bag-of-words model is commonly used in methods of document classification where the (frequency of) occurrence of each word is used as a feature for training a classifier.[2] An early reference to "bag of words" in a linguistic context can be found in Zellig Harris's 1954 article on Distributional Structure.
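The document-classification use mentioned in the excerpt can be sketched with scikit-learn, assuming its CountVectorizer (which builds the bag-of-words count matrix) and a multinomial naive Bayes classifier are available; the toy documents and spam/ham labels are invented for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy training data (illustrative assumption).
train_docs = [
    "cheap meds buy now",
    "limited offer buy cheap",
    "meeting agenda for monday",
    "please review the attached report",
]
train_labels = ["spam", "spam", "ham", "ham"]

# Each document becomes a sparse vector of word occurrence counts,
# i.e., a point in the bag-of-words vector space over the training vocabulary.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_docs)

# The count vectors serve directly as features for the classifier.
classifier = MultinomialNB()
classifier.fit(X_train, train_labels)

X_test = vectorizer.transform(["cheap offer for monday meeting"])
print(classifier.predict(X_test))  # predicted label for the unseen document
```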