Word-Word Co-Occurrence Matrix
A Word-Word Co-Occurrence Matrix is a lexical co-occurrence matrix whose rows and columns are both indexed by vocabulary words and whose cells record how often each word pair co-occurs (often derived from a word co-occurrence dataset).
- Context:
- It can range from being a Raw Word-Word Co-Occurrence Matrix to being a Weighted Word-Word Co-Occurrence Matrix (such as a word-word PMI matrix); a minimal construction sketch follows the See list below.
- Example(s):
- Counter-Example(s):
- a Word-Context Window Co-Occurrence Matrix, such as ...
- a Word-Document Co-Occurrence Matrix, such as ...
- See: Distributional Semantic Modeling Algorithm, GloVe Algorithm.
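The following is a minimal Python sketch of how such a matrix can be built from a tokenized corpus and then re-weighted with positive PMI. The toy corpus, the symmetric context window of size 2, and the function names build_cooccurrence and ppmi are illustrative assumptions, not taken from any of the cited works.

```python
from collections import Counter
import math

def build_cooccurrence(sentences, window=2):
    """Count symmetric word-word co-occurrences within a fixed window."""
    counts = Counter()
    for tokens in sentences:
        for i, w in enumerate(tokens):
            # left context only, so each unordered pair is visited once;
            # both (w, c) and (c, w) are incremented to keep the matrix symmetric
            for c in tokens[max(0, i - window): i]:
                counts[(w, c)] += 1
                counts[(c, w)] += 1
    return counts

def ppmi(counts):
    """Re-weight raw counts with positive pointwise mutual information (PPMI)."""
    total = sum(counts.values())
    row_totals = Counter()
    for (w, _c), n in counts.items():
        row_totals[w] += n
    weighted = {}
    for (w, c), n in counts.items():
        pmi = math.log2((n / total) /
                        ((row_totals[w] / total) * (row_totals[c] / total)))
        if pmi > 0:
            weighted[(w, c)] = pmi
    return weighted

corpus = [["the", "cat", "sat", "on", "the", "mat"],
          ["the", "dog", "sat", "on", "the", "rug"]]
raw = build_cooccurrence(corpus, window=2)
print(raw[("cat", "sat")])             # raw co-occurrence count
print(ppmi(raw).get(("cat", "sat")))   # PPMI weight (None if PMI <= 0)
```

In practice such matrices are stored sparsely, since for a realistic vocabulary most word pairs never co-occur.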
References
2015
- (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/word_embedding Retrieved:2015-1-31.
- … There are several methods for generating this mapping. They include neural networks, dimensionality reduction on the word co-occurrence matrix, and explicit representation in terms of the context in which words appear. ...
2014
- (Pennington et al., 2014) ⇒ Jeffrey Pennington, Richard Socher, and Christopher D. Manning. (2014). “GloVe: Global Vectors for Word Representation.” In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014).
- QUOTE: ... The result is a new global log-bilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods. Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus.
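To make the “training only on the nonzero elements” point concrete, here is a hedged Python sketch of a GloVe-style weighted least-squares objective over a sparse dictionary of co-occurrence counts. The plain SGD loop and learning rate are illustrative simplifications (the paper uses AdaGrad), and the function and variable names are not the authors' implementation.

```python
import numpy as np

def glove_weight(x, x_max=100.0, alpha=0.75):
    """Weighting function f(X_ij): down-weights rare pairs, caps frequent ones."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def train_glove_sketch(cooccur, vocab_size, dim=50, epochs=25, lr=0.05, seed=0):
    """Fit word and context vectors using only the nonzero co-occurrence entries.

    cooccur: dict mapping (i, j) word-index pairs to counts X_ij > 0.
    Minimizes sum_ij f(X_ij) * (w_i . c_j + b_i + b_j - log X_ij)^2 by plain SGD.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(vocab_size, dim))   # word vectors
    C = rng.normal(scale=0.1, size=(vocab_size, dim))   # context vectors
    bw = np.zeros(vocab_size)                           # word biases
    bc = np.zeros(vocab_size)                           # context biases
    for _ in range(epochs):
        for (i, j), x in cooccur.items():               # nonzero cells only
            diff = W[i] @ C[j] + bw[i] + bc[j] - np.log(x)
            g = glove_weight(x) * diff
            grad_w, grad_c = g * C[j], g * W[i]
            W[i] -= lr * grad_w
            C[j] -= lr * grad_c
            bw[i] -= lr * g
            bc[j] -= lr * g
    return W + C  # the paper reports summing word and context vectors
```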
- (Řehůřek, 2014) ⇒ Radim Řehůřek. (2014-12-23). “Making Sense of Word2vec.” Blog post. http://radimrehurek.com/2014/12/making-sense-of-word2vec/
- QUOTE: ... Basically, where GloVe precomputes the large word x word co-occurrence matrix in memory and then quickly factorizes it, word2vec sweeps through the sentences in an online fashion, handling each co-occurrence separately. So, there is a tradeoff between taking more memory (GloVe) vs. taking longer to train (word2vec). Also, once computed, GloVe can re-use the co-occurrence matrix to quickly factorize with any dimensionality, whereas word2vec has to be trained from scratch after changing its embedding dimensionality.
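As a rough illustration of the re-use point in this quote, the sketch below factorizes a precomputed sparse co-occurrence matrix at any chosen dimensionality without re-scanning the corpus. Truncated SVD on log counts stands in here for GloVe's own log-bilinear factorization; it is an assumption for illustration, not either system's actual procedure.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.linalg import svds

def embeddings_from_matrix(cooccur, vocab_size, dim):
    """Factorize a precomputed {(i, j): count} co-occurrence dict at dimensionality `dim`."""
    rows, cols, vals = zip(*((i, j, np.log1p(x)) for (i, j), x in cooccur.items()))
    X = coo_matrix((vals, (rows, cols)), shape=(vocab_size, vocab_size)).tocsr()
    U, S, _ = svds(X, k=dim)          # truncated SVD of the sparse matrix
    return U * np.sqrt(S)             # scaled left singular vectors as embeddings

# Same precomputed matrix, two dimensionalities, no re-counting of the corpus:
# emb_50 = embeddings_from_matrix(cooccur, vocab_size, 50)
# emb_200 = embeddings_from_matrix(cooccur, vocab_size, 200)
```

A streaming system such as word2vec, by contrast, would need a full re-training pass over the corpus for each new dimensionality.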
2010
- (Momtazi et al., 2010) ⇒ Saeedeh Momtazi, Sanjeev Khudanpur, and Dietrich Klakow. (2010). “A Comparative Study of Word Co-occurrence for Term Clustering in Language Model-based Sentence Retrieval.” In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. ISBN:1-932432-65-5
2007
- (Scovanner et al., 2007) ⇒ Paul Scovanner, Saad Ali, and Mubarak Shah. (2007). “A 3-Dimensional Sift Descriptor and its Application to Action Recognition.” In: Proceedings of the 15th International Conference on Multimedia.
- QUOTE: ... more discriminative action video representation. The test for finding word co-occurrences is carried out as follows. We construct a word co-occurrence matrix and populate it using frequency histograms of videos. If the size of the ...