One Sense per Discourse
A One Sense per Discourse Algorithm is a Word Sense Disambiguation Algorithm that is based on the one-sense-per-discourse hypothesis of Gale et al. (1992).
- AKA: One Sense per Discourse Heuristic.
- Example(s):
- the algorithm described in Gale et al. (1992),
- the algorithm described in Yarowsky (1993),
- …
- Counter-Example(s):
- See: Word Sense, Discourse, Collocation, Word Sense Disambiguation, Natural Language Processing, Natural Language Understanding, Language Model, Word Sense Discrimination Algorithm, Entity Mention Recognition Algorithm.
References
1993
- (Yarowsky, 1993) ⇒ David Yarowsky. (1993). “One Sense per Collocation.” In: Proceedings of the Workshop on Human Language Technology. doi:10.3115/1075671.1075731
- QUOTE: Collocation means the co-occurrence of two words in some defined relationship. We look at several such relationships, including direct adjacency and first word to the left or right having a certain part-of-speech. We also consider certain direct syntactic relationships, such as verb/object, subject/verb, and adjective/noun pairs. It appears that content words (nouns, verbs, adjectives, and adverbs) behave quite differently from function words (other parts of speech); we make use of this distinction in several definitions of collocation.
We will attempt to quantify the validity of the one-sense-per-collocation hypothesis for these different collocation types. (...)
The sense disambiguation algorithm used is quite straightforward. When based on a single collocation type, such as the object of the verb or word immediately to the left, the procedure is very simple. One identifies if this collocation type exists for the novel context and if the specific words found are listed in the table of probability distributions (as computed above). If so, we return the sense which was most frequent for that collocation in the training data. If not, we return the sense which is most frequent overall.
...
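The single-collocation procedure quoted above reduces to a table lookup with a most-frequent-sense backoff. The following Python sketch illustrates that idea under simplifying assumptions; the function names and toy data are hypothetical and are not taken from Yarowsky (1993).

```python
from collections import Counter, defaultdict

# Illustrative sketch of a single-collocation sense classifier in the spirit
# of Yarowsky (1993). Data structures and names are assumptions, not the
# paper's implementation.

def train(examples):
    """examples: iterable of (collocate, sense) pairs for one target word,
    where `collocate` is e.g. the word immediately to the left."""
    by_collocate = defaultdict(Counter)   # collocate -> sense counts
    overall = Counter()                   # sense counts ignoring collocates
    for collocate, sense in examples:
        by_collocate[collocate][sense] += 1
        overall[sense] += 1
    return by_collocate, overall

def disambiguate(collocate, by_collocate, overall):
    """Return the sense most frequent for this collocate in training data;
    if the collocate was never seen, back off to the overall most frequent sense."""
    if collocate in by_collocate:
        return by_collocate[collocate].most_common(1)[0][0]
    return overall.most_common(1)[0][0]

# Toy usage for the target word "bank":
training = [("river", "shore"), ("river", "shore"), ("savings", "institution"),
            ("savings", "institution"), ("central", "institution")]
by_collocate, overall = train(training)
print(disambiguate("river", by_collocate, overall))    # -> "shore"
print(disambiguate("unknown", by_collocate, overall))  # -> "institution"
```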
1992
- (Gale et al., 1992) ⇒ William A. Gale, Kenneth W. Church, and David Yarowsky. (1992). “One Sense per Discourse.” In: Proceedings of the DARPA Speech and Natural Language Workshop.
- QUOTE: In conclusion, it appears that our hypothesis is correct; well-written discourses tend to avoid multiple senses of a polysemous word. This result can be used in two basic ways: (1) as an additional source of constraint for improving the performance of a word-sense disambiguation algorithm, and (2) as an aid in collecting annotated test materials for evaluating disambiguation algorithms.
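The first use mentioned by Gale et al. (1992), applying the hypothesis as an additional constraint on a word-sense disambiguation algorithm, can be illustrated by a post-processing step that replaces token-level sense guesses with the discourse-level majority sense. The Python sketch below is an assumption-based illustration, not the procedure from the paper.

```python
from collections import Counter

# Illustrative use of the one-sense-per-discourse constraint: within one
# discourse, every occurrence of a target word is reassigned the sense that
# the base disambiguator predicted most often for that word in the discourse.

def apply_one_sense_per_discourse(token_predictions):
    """token_predictions: list of (target_word, predicted_sense) for one discourse.
    Returns the list with each target word mapped to its majority sense."""
    counts = {}
    for word, sense in token_predictions:
        counts.setdefault(word, Counter())[sense] += 1
    majority = {word: sense_counts.most_common(1)[0][0]
                for word, sense_counts in counts.items()}
    return [(word, majority[word]) for word, _ in token_predictions]

# Toy usage: three mentions of "plant" in one document, one outlier prediction.
preds = [("plant", "factory"), ("plant", "factory"), ("plant", "flora")]
print(apply_one_sense_per_discourse(preds))
# -> [('plant', 'factory'), ('plant', 'factory'), ('plant', 'factory')]
```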