1994 CoOccurrenceVectorsfromCorporaV
- (Niwa & Nitta, 1994) ⇒ Yoshiki Niwa, and Yoshihiko Nitta. (1994). “Co-occurrence Vectors from Corpora Vs. Distance Vectors from Dictionaries.” In: Proceedings of the 15th conference on Computational linguistics - Volume 1. doi:10.3115/991886.991938
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%221994%22+Co-occurrence+Vectors+from+Corpora+Vs.+Distance+Vectors+from+Dictionaries
- http://dl.acm.org/citation.cfm?id=991886.991938&preflayout=flat#citedby
Quotes
Abstract
A comparison was made of vectors derived by using ordinary co-occurrence statistics from large text corpora and of vectors derived by measuring the interword distances in dictionary definitions. The precision of word sense disambiguation by using co-occurrence vectors from the 1987 Wall Street Journal (20M total words) was higher than that by using distance vectors from the Collins English Dictionary (60K head words + 1.6M definition words). However, other experimental results suggest that distance vectors contain some different semantic information from co-occurrence vectors.
1 Introduction
…
3 Co-occurrence Vectors
We use ordinary co-occurrence statistics and measure the co-occurrence likelihood between two words, X and Y, by the mutual information estimate (Church and Hanks, 1989): [math]\displaystyle{ I(\mathbf{X},\mathbf{Y}) = \log^+ \frac{P(\mathbf{X} \mid \mathbf{Y})}{P(\mathbf{X})} }[/math], where P(X) is the occurrence density of word X in a whole corpus, and the conditional probability [math]\displaystyle{ P(\mathbf{X} \mid \mathbf{Y}) }[/math] is the density of word X in a neighborhood of word Y. Here the neighborhood is defined as 50 words before or after any appearance of word Y. (There is a variety of neighborhood definitions such as "100 surrounding words" (Yarowsky 1992) and "within a distance of no more than 3 words ignoring function words" (Dagan et al., 1993).)
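The estimate quoted above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions that the excerpt does not settle: `log^+` is taken to mean the logarithm clipped at zero, the logarithm is base 2, overlapping neighborhoods of repeated occurrences of Y are counted once, and the function name `cooccurrence_mi` and its parameters are hypothetical.

```python
import math


def cooccurrence_mi(tokens, x, y, window=50):
    """Estimate I(x, y) = log+ ( P(x | y) / P(x) ) over a token list.

    Assumptions (not specified in the quoted excerpt):
      - log+ is max(0, log2(.)),
      - overlapping neighborhoods of different occurrences of y are merged.
    """
    n = len(tokens)
    if n == 0:
        return 0.0

    # P(x): occurrence density of x in the whole corpus.
    p_x = tokens.count(x) / n
    if p_x == 0:
        return 0.0

    # Positions lying within `window` words before or after any occurrence of y.
    neighborhood = set()
    for i, tok in enumerate(tokens):
        if tok == y:
            lo, hi = max(0, i - window), min(n, i + window + 1)
            neighborhood.update(j for j in range(lo, hi) if j != i)
    if not neighborhood:
        return 0.0

    # P(x | y): density of x inside the neighborhood of y.
    p_x_given_y = sum(1 for j in neighborhood if tokens[j] == x) / len(neighborhood)
    if p_x_given_y == 0:
        return 0.0

    # Clip negative values to zero (assumed meaning of log+).
    return max(0.0, math.log2(p_x_given_y / p_x))
```

Calling `cooccurrence_mi(tokens, "bank", "money")` on a tokenized corpus would give one component of a co-occurrence vector for "bank"; the 50-word default follows the neighborhood definition in the quoted passage.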
…
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 1994 CoOccurrenceVectorsfromCorporaV | Yoshiki Niwa, Yoshihiko Nitta | | | Co-occurrence Vectors from Corpora Vs. Distance Vectors from Dictionaries | | | | 10.3115/991886.991938 | | 1994 |