Word Sense Inventory

Context:
- It can be an Input to a Word Sense Normalization Task.
- It can range from being a Common Word Inventory to being a Technical Term Inventory.
- It can be a Subset of a Lexical Database.
- It can be associated with an Ontology.
Example(s):
- A Lexical Database, such as WordNet.
- An Ontology, if its Ontology Records are associated with Word Forms.
- …
Counter-Example(s):
- A Thesaurus.
See: Sense-Tagged Corpora, Entity Record Set.
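
Illustration: the structure described above can be sketched as a mapping from word forms to candidate sense records. This is only a minimal sketch, not a reference implementation; the class names (`SenseRecord`, `WordSenseInventory`), the sense identifiers, and the glosses are hypothetical and do not come from WordNet or any other published inventory.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class SenseRecord:
    """One entry of the inventory: a sense identifier plus a short gloss."""
    sense_id: str  # hypothetical identifier scheme, for illustration only
    gloss: str


@dataclass
class WordSenseInventory:
    """Maps each word form to the candidate senses it may denote."""
    senses_by_form: dict[str, list[SenseRecord]] = field(default_factory=dict)

    def add(self, word_form: str, sense: SenseRecord) -> None:
        # Word forms are case-folded so "Mouse" and "mouse" share one entry.
        self.senses_by_form.setdefault(word_form.casefold(), []).append(sense)

    def candidates(self, word_form: str) -> list[SenseRecord]:
        # An unknown word form has no candidate senses.
        return list(self.senses_by_form.get(word_form.casefold(), []))


# Illustrative entries only; a real inventory would be loaded from a resource.
inventory = WordSenseInventory()
inventory.add("mouse", SenseRecord("mouse.animal", "small rodent"))
inventory.add("mouse", SenseRecord("mouse.device", "hand-operated pointing device"))
inventory.add("river", SenseRecord("river.watercourse", "large natural stream of water"))
```

A task that consumes the inventory would call `inventory.candidates("mouse")` to obtain the senses to choose among. How finely those senses are divided is a design choice that depends on the task.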

References

http://www.scholarpedia.org/article/Word_sense_disambiguation
- A task-independent sense inventory is not a coherent concept: each task requires its own division of word meaning into senses relevant to the task. For example, the ambiguity of mouse (animal or device) is not relevant in English-French machine translation, but is relevant in information retrieval. The opposite is true of river, which requires a choice in French (fleuve 'flows into the sea', or rivière 'flows into a river').

(MohammadH, 2006) ⇒ Saif Mohammad, and Graeme Hirst. (2006). “Determining Word Sense Dominance Using a Thesaurus.” In: Proceedings of EACL-2006.
- While other sense inventories such as WordNet exist, use of a published thesaurus has three distinct advantages: (i) coarse senses — it is widely believed that the sense distinctions of WordNet are far too fine-grained (Agirre and Lopez de Lacalle Lekuona (2003) and citations therein); (ii) computational ease — with just around a thousand categories, the word–category matrix has a manageable size; (iii) widespread availability — thesauri are available (or can be created with relatively less effort) in numerous languages, while WordNet is available only for English and a few romance languages. We use the Macquarie Thesaurus (Bernard, 1986) for our experiments. It consists of 812 categories with around 176,000 c-terms and 98,000 word types. Note, however, that using a sense inventory other than WordNet will mean that we cannot directly compare performance with McCarthy et al. (2004), as that would require knowing exactly how thesaurus senses map to WordNet. Further, it has been argued that such a mapping across sense inventories is at best difficult and maybe impossible (Kilgarriff and Yallop (2001) and citations therein).