Word Sense Inventory
A Word Sense Inventory is a lexical database of word sense inventory records.
- Context:
- It can be an Input to a Word Sense Normalization Task.
- It can range from being a Common Word Inventory to being a Technical Term Inventory.
- It can be a Subset of a Lexical Database.
- It can be associated with an Ontology.
- Example(s):
- A Lexical Database, such as WordNet.
- An Ontology, if its Ontology Records are associated with Word Forms.
- …
- Counter-Example(s):
- A Thesaurus.
- See: Sense-Tagged Corpora, Entity Record Set.
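As an illustration of the definition above, a Word Sense Inventory can be sketched as a mapping from word forms to sense records. This is a minimal sketch, not a reference implementation: the names `SenseRecord` and `WordSenseInventory`, the sense identifiers, and the glosses are all hypothetical and are not taken from WordNet or any other published inventory.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SenseRecord:
    """One word sense inventory record (hypothetical schema)."""
    sense_id: str   # illustrative identifier, not a real WordNet key
    word_form: str  # surface form the sense is attached to
    gloss: str      # short definition of this sense


class WordSenseInventory:
    """Maps each word form to the list of senses recorded for it."""

    def __init__(self):
        self._by_form = defaultdict(list)

    def add(self, record):
        # Index case-insensitively so "Mouse" and "mouse" share senses.
        self._by_form[record.word_form.lower()].append(record)

    def senses(self, word_form):
        # Return a copy; an unknown word form yields an empty list.
        return list(self._by_form.get(word_form.lower(), []))


inventory = WordSenseInventory()
inventory.add(SenseRecord("mouse#animal", "mouse", "small rodent with a long tail"))
inventory.add(SenseRecord("mouse#device", "mouse", "hand-operated pointing device for a computer"))
```

Calling `inventory.senses("Mouse")` returns both records, while a word form with no recorded senses returns an empty list. An inventory restricted to domain vocabulary would correspond to the Technical Term Inventory end of the range described above.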
References
2009
- http://www.scholarpedia.org/article/Word_sense_disambiguation
- A task-independent sense inventory is not a coherent concept: each task requires its own division of word meaning into senses relevant to the task. For example, the ambiguity of mouse (animal or device) is not relevant in English-French machine translation, but is relevant in information retrieval. The opposite is true of river, which requires a choice in French (fleuve 'flows into the sea', or rivière 'flows into a river').
2006
- (MohammadH, 2006) ⇒ Saif Mohammad, and Graeme Hirst. (2006). “Determining Word Sense Dominance Using a Thesaurus.” In: Proceedings of EACL-2006.
- While other sense inventories such as WordNet exist, use of a published thesaurus has three distinct advantages: (i) coarse senses — it is widely believed that the sense distinctions of WordNet are far too fine-grained (Agirre and Lopez de Lacalle Lekuona (2003) and citations therein); (ii) computational ease — with just around a thousand categories, the word–category matrix has a manageable size; (iii) widespread availability — thesauri are available (or can be created with relatively less effort) in numerous languages, while WordNet is available only for English and a few romance languages. We use the Macquarie Thesaurus (Bernard, 1986) for our experiments. It consists of 812 categories with around 176,000 c-terms and 98,000 word types. Note, however, that using a sense inventory other than WordNet will mean that we cannot directly compare performance with McCarthy et al. (2004), as that would require knowing exactly how thesaurus senses map to WordNet. Further, it has been argued that such a mapping across sense inventories is at best difficult and maybe impossible (Kilgarriff and Yallop (2001) and citations therein).
2003
- (Patwardhan et al., 2003) ⇒ Siddharth Patwardhan, Satanjeev Banerjee, and Ted Pedersen. (2003). “Using Measures of Semantic Relatedness for Word Sense Disambiguation.” In: Proceedings of the Fourth International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2003).
- Word sense disambiguation is the process of assigning a meaning to a word based on the context in which it occurs. The most appropriate meaning for a word is selected from a predefined set of possibilities, usually known as a sense inventory.
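The selection step described in the quotation above can be illustrated with a simplified Lesk-style gloss overlap. This is a swapped-in toy technique chosen for brevity, not the semantic relatedness measures studied by Patwardhan et al. The inventory contents, the stopword list, and the function names are hypothetical assumptions made for the sketch.

```python
# Toy sense inventory: word form -> {sense id: gloss}. Contents are invented.
SENSE_INVENTORY = {
    "mouse": {
        "mouse#animal": "small rodent with a long tail that lives in fields and houses",
        "mouse#device": "hand-operated device used to move a cursor on a computer screen",
    }
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "for", "with", "to", "in", "on",
             "and", "or", "is", "that"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    return {tok.strip(".,;:!?()'\"").lower() for tok in text.split()} - {""}


def disambiguate(word, context, inventory=SENSE_INVENTORY):
    """Pick the sense whose gloss shares the most content words with the context.

    Returns None when the word is not in the inventory. Ties go to the
    sense listed first.
    """
    senses = inventory.get(word.lower())
    if not senses:
        return None
    context_tokens = tokenize(context) - STOPWORDS - {word.lower()}

    def overlap(item):
        _sense_id, gloss = item
        return len(context_tokens & (tokenize(gloss) - STOPWORDS))

    best_id, _gloss = max(senses.items(), key=overlap)
    return best_id


device = disambiguate("mouse", "She clicked the mouse to move the cursor across the screen")
animal = disambiguate("mouse", "The cat chased a mouse with a long tail through the fields")
```

Here `device` is `"mouse#device"` and `animal` is `"mouse#animal"`, because each context shares three content words with the corresponding gloss and none with the other. The point of the sketch is only that the disambiguator can choose among the senses the inventory lists, so the granularity of the inventory bounds the distinctions it can make.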