Distributional Lexical Semantics Heuristic
Distributional Lexical Semantics Heuristic is a lexical semantics heuristic which states that words are semantically similar to the extent to which they share syntactic contexts (i.e. have similar word distributions). A minimal count-based sketch illustrating the heuristic is given after the list below.
- AKA: Harris' Distributional Hypothesis.
- Context:
- It was originally proposed in (Harris, 1954).
- It can be proposed by a Distributional Semantics Theory.
- It can be used by a Distributional Lexical Semantics Modeling Algorithm.
- …
- Example(s):
- an n-Gram Language Model, in which each word depends probabilistically on the [math]\displaystyle{ n-1 }[/math] preceding words: [math]\displaystyle{ P(w_1 \ldots w_m) \approx \prod_{i=1}^{m} P(w_i \mid w_{i-n+1} \ldots w_{i-1}) }[/math].
- …
- Counter-Example(s):
- See: Text Window, Word Vector, Distributional Semantics.
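The sketch below (not taken from any of the cited sources) shows one common way to operationalize the heuristic: represent each word by the counts of words occurring within a fixed-size text window around it, then compare two words by the cosine similarity of their context-count vectors. The toy corpus, the window size of 2, and whitespace tokenization are illustrative assumptions.

```python
# Minimal sketch of the distributional heuristic: words that share many
# window contexts receive a high cosine similarity over their
# context-count vectors.
from collections import Counter, defaultdict
import math

def context_vectors(tokens, window=2):
    """Map each word to a Counter of words co-occurring within `window` positions."""
    vectors = defaultdict(Counter)
    for i, w in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[w][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# Toy corpus: "coffee" and "tea" appear in the same window contexts,
# "car" does not.
corpus = ("i drank strong coffee this morning . "
          "i drank strong tea this morning . "
          "i parked the car outside .").split()
vecs = context_vectors(corpus, window=2)
print(cosine(vecs["coffee"], vecs["tea"]))   # high: shared contexts
print(cosine(vecs["coffee"], vecs["car"]))   # low: no shared contexts
```

Under this sketch, "coffee" and "tea" come out as highly similar because they share the contexts "drank strong … this morning", while "coffee" and "car" share no contexts at all.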
References
2014
- (Goldberg & Levy, 2014) ⇒ Yoav Goldberg, and Omer Levy. (2014). “word2vec Explained: Deriving Mikolov Et Al.'s Negative-sampling Word-embedding Method.” In: arXiv preprint arXiv:1402.3722.
- QUOTE: Why does this produce good word representations? We don't really know. The distributional hypothesis states that words in similar contexts have similar meanings. The objective above clearly tries to increase the quantity [math]\displaystyle{ v_w \cdot v_c }[/math] for good word-context pairs, and decrease it for bad ones. Intuitively, this means that words that share many contexts will be similar to each other (note also that contexts sharing many words will also be similar to each other).
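- The "objective above" referenced in this quote is the negative-sampling objective; paraphrased here in approximate notation (not quoted from the paper), it maximizes over word vectors [math]\displaystyle{ v_w }[/math] and context vectors [math]\displaystyle{ v_c }[/math] the quantity [math]\displaystyle{ \sum_{(w,c) \in D} \log \sigma(v_w \cdot v_c) + \sum_{(w,c) \in D'} \log \sigma(-v_w \cdot v_c) }[/math], where [math]\displaystyle{ D }[/math] is the set of observed word-context pairs, [math]\displaystyle{ D' }[/math] a set of randomly sampled negative pairs, and [math]\displaystyle{ \sigma }[/math] the logistic sigmoid.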
- (Baroni et al., 2014) ⇒ Marco Baroni, Georgiana Dinu, and Germán Kruszewski. (2014). “Don't Count, Predict! A Systematic Comparison of Context-counting vs. Context-predicting Semantic Vectors.” In: Proceedings of ACL 2014.
- QUOTE: A long tradition in computational linguistics has shown that contextual information provides a good approximation to word meaning, since semantically similar words tend to have similar contextual distributions (Miller & Charles, 1991).
2008
- http://www.aclweb.org/aclwiki/index.php?title=Distributional_Hypothesis
- QUOTE: The Distributional Hypothesis is that words that occur in the same contexts tend to have similar meanings (Harris, 1954). The underlying idea that "a word is characterized by the company it keeps" was popularized by Firth (1957), and it is implicit in Weaver's (1955) discussion of word sense disambiguation (originally written as a memorandum, in 1949). The Distributional Hypothesis is the basis for Statistical Semantics. Although the Distributional Hypothesis originated in Linguistics, it is now receiving attention in Cognitive Science (McDonald and Ramscar, 2001). The origin and theoretical basis of the Distributional Hypothesis is discussed by Sahlgren (2008).
2005
- (Cimiano & Völker, 2005) ⇒ Philipp Cimiano, and Johanna Völker. (2005). “Towards Large-scale, Open-domain and Ontology-based Named Entity Classification.” In: Proceedings of RANLP-2005.
- QUOTE: In this paper we present an unsupervised approach which - as many others - is based on Harris' distributional hypothesis, i.e. that words are semantically similar to the extent to which they share syntactic contexts.
2002
- (Gleitman, 2002) ⇒ Lila R. Gleitman. (2002). “Verbs of a Feather Flock Together II: The Child's Discovery of Words and Their Meanings.” In: The Legacy of Zellig Harris: Language and Information into the 21st Century, Vol. 1: Philosophy of Science, Syntax and Semantics. Current Issues in Linguistic Theory (John Benjamins Publishing Company). doi:10.1075/cilt.228.17gle
1991
- (Miller & Charles, 1991) ⇒ George A. Miller, and Walter G. Charles. (1991). “Contextual Correlates of Semantic Similarity.” In: Language and Cognitive Processes, 6(1). doi:10.1080/01690969108406936
- ABSTRACT: The relationship between semantic and contextual similarity is investigated for pairs of nouns that vary from high to low semantic similarity. Semantic similarity is estimated by subjective ratings; contextual similarity is estimated by the method of sorting sentential contexts. The results show an inverse linear relationship between similarity of meaning and the discriminability of contexts. This relation is obtained for two separate corpora of sentence contexts. It is concluded that, on average, for words in the same language drawn from the same syntactic and semantic categories, the more often two words can be substituted into the same contexts the more similar in meaning they are judged to be.
1965
- (Rubenstein & Goodenough, 1965) ⇒ Herbert Rubenstein, and John B. Goodenough. (1965). “Contextual Correlates of Synonymy.” In: Communications of the ACM, 8(10).
- QUOTE: Words which are similar in meaning occur in similar contexts.
1954
- (Harris, 1954) ⇒ Zellig Harris. (1954). “Distributional Structure.” In: Word, 10(2-3).