2008 RulebasedSynonymsforEntityExtra
- (Ananthanarayanan et al., 2008) ⇒ Rema Ananthanarayanan, Vijil Chenthamarakshan, Prasad M Deshpande, and Raghuram Krishnapuram. (2008). “Rule based Synonyms for Entity Extraction from Noisy Text.” In: Proceedings of the second workshop on Analytics for noisy unstructured text data. doi:10.1145/1390749.1390756
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%22Rule+based+synonyms+for+entity+extraction+from+noisy+text%22+2008
- http://dl.acm.org/citation.cfm?doid=1390749.1390756&preflayout=flat#citedby
Quotes
Author Keywords
Named entity extraction, Product name extraction, Synonym generation
Abstract
Identification of named entities such as person, organization and product names from text is an important task in information extraction. In many domains, the same entity could be referred to in multiple ways due to variations introduced by different user groups, variations of spellings across regions or cultures, usage of abbreviations, typographical errors and other reasons associated with conventional usage. Identifying a piece of text as a mention of an entity in such noisy data is difficult, even if we have a dictionary of possible entities. Previous approaches treat the synonym problem as part of entity disambiguation and use learning-based methods that use the context of the words to identify synonyms. In this paper, we show that existing domain knowledge, encoded as rules, can be used effectively to address the synonym problem to a considerable extent. This makes the disambiguation task simpler, without the need for much training data. We look at a subset of application scenarios in named entity extraction, categorize the possible variations in entity names, and define rules for each category. Using these rules, we generate synonyms for the canonical list and match these synonyms to the actual occurrence in the data sets. In particular, we describe the rule categories that we developed for several named entities and report the results of applying our technique of extracting named entities by generating synonyms for two different domains.
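The generate-then-match idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three rules (`drop_punctuation`, `acronym`, `join_words`), the helper names, and the example entity names are all assumptions chosen for illustration, and the paper's actual rule categories are not reproduced here.

```python
import re
from typing import Callable, Dict, List, Set

# A rule maps one canonical name to a set of candidate synonyms.
# All rules below are hypothetical examples, not the paper's rule set.
Rule = Callable[[str], Set[str]]


def drop_punctuation(name: str) -> Set[str]:
    """Variants with hyphens, dots and slashes removed or turned into spaces."""
    return {re.sub(r"[-./]", "", name), re.sub(r"[-./]", " ", name)}


def acronym(name: str) -> Set[str]:
    """First-letter acronym for multi-word names."""
    words = name.split()
    if len(words) < 2:
        return set()
    return {"".join(word[0] for word in words).upper()}


def join_words(name: str) -> Set[str]:
    """Variant with all spaces removed."""
    return {name.replace(" ", "")}


RULES: List[Rule] = [drop_punctuation, acronym, join_words]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so matching ignores case and spacing."""
    return re.sub(r"\s+", " ", text.strip().lower())


def generate_synonyms(canonical: str, rules: List[Rule]) -> Set[str]:
    """Apply every rule once to the canonical name and normalize the results."""
    variants = {canonical}
    for rule in rules:
        variants |= rule(canonical)
    return {normalize(v) for v in variants if v.strip()}


def build_index(canonical_names: List[str], rules: List[Rule]) -> Dict[str, str]:
    """Map each generated synonym back to its canonical name."""
    index: Dict[str, str] = {}
    for name in canonical_names:
        for synonym in generate_synonyms(name, rules):
            index.setdefault(synonym, name)  # first canonical name wins on collision
    return index


def extract_entities(text: str, index: Dict[str, str]) -> List[str]:
    """Return canonical names whose synonyms occur as whole tokens in the text."""
    normalized = normalize(text)
    found: List[str] = []
    for synonym in sorted(index, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(synonym) + r"(?!\w)"
        if re.search(pattern, normalized) and index[synonym] not in found:
            found.append(index[synonym])
    return found


if __name__ == "__main__":
    names = ["International Business Machines", "Wi-Fi Router"]
    index = build_index(names, RULES)
    print(extract_entities("My WiFi router from IBM keeps dropping", index))
```

In this sketch each rule is applied once to the canonical name; composing rules (for example, dropping punctuation and then joining words) would generate more variants at the cost of more potential false matches, which is a design choice the sketch leaves open.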
References
| Author | title | doi | year |
---|---|---|---|---|
2008 RulebasedSynonymsforEntityExtra | Rema Ananthanarayanan, Vijil Chenthamarakshan, Prasad M Deshpande, Raghuram Krishnapuram | Rule based Synonyms for Entity Extraction from Noisy Text | 10.1145/1390749.1390756 | 2008 |