Entity Mention Discovery Task
An Entity Mention Discovery Task is a Discovery Task that requires the discovery of the Entity Mentions contained within a Text Document.
- AKA: Entity Discovery, Entity Discovery Task, Document Entity Mention Classification Task.
- Context:
- Input:
- an Entity Mention Set.
- a Text Document.
- Output: zero or more Entity Mentions.
- Example(s):
- $f$(“I bought a PowerShot SX110 yesterday. I took some pictures in the evening in my living room. The images are very clear. They are definitely better than those from my old Polaroid a930. The battery is very good too.”) ⇒ {PowerShot SX110, Polaroid a930}
- See: Entity Mention, Discovery Task, Named Entity Recognition Task, Sentence Entity Mention Classification Task.
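The example above can be sketched in code. This is a minimal illustration only, assuming the input Entity Mention Set is a list of known entity names and that discovery reduces to case-insensitive whole-phrase matching; the function name `discover_entity_mentions` and the extra entity "Nikon D90" are hypothetical, and real systems typically use statistical or pattern-based recognizers rather than a fixed dictionary.

```python
import re


def discover_entity_mentions(entity_names, document):
    """Return the known entity names found in the document,
    ordered by the position of their first occurrence."""
    found = []
    for name in entity_names:
        # Match the whole phrase only, so "a930" inside a longer token is ignored.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        match = re.search(pattern, document, flags=re.IGNORECASE)
        if match:
            found.append((match.start(), name))
    return [name for _, name in sorted(found)]


document = (
    "I bought a PowerShot SX110 yesterday. I took some pictures in the "
    "evening in my living room. The images are very clear. They are "
    "definitely better than those from my old Polaroid a930. "
    "The battery is very good too."
)

# Hypothetical entity set; "Nikon D90" is included to show a non-match.
known_entities = ["Polaroid a930", "Nikon D90", "PowerShot SX110"]

mentions = discover_entity_mentions(known_entities, document)
print(mentions)  # ['PowerShot SX110', 'Polaroid a930']
```

A document that mentions none of the known entities yields an empty list, which matches the "zero or more Entity Mentions" output described above.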
References
2009
- (Ding et al., 2009) ⇒ Xiaowen Ding, Bing Liu, and Lei Zhang. (2009). “Entity Discovery and Assignment for Opinion Mining Applications.” In: Proceedings of ACM SIGKDD Conference (KDD-2009). doi:10.1145/1557019.1557141
- Opinion mining became an important topic of study in recent years due to its wide range of applications. There are also many companies offering opinion mining services. One problem that has not been studied so far is the assignment of entities that have been talked about in each sentence.
- Related work on entity discovery is mainly in the field of named entity recognition (NER). NER aims to identify entities such as names of persons, organizations and locations in natural language text.
- Problem statement: Given a set of threads $T$ in a particular domain, two tasks are performed in this paper:
- 1. Entity discovery: discover the set of entities $E$ discussed in the posts of the threads, and
- 2. Entity assignment: assign the entities in $E$ that each sentence $s_i$ of each post $p_j$ in $t$ ($t \in T$) talks about.
2004
- (Shinyama & Sekine, 2004) ⇒ Yusuke Shinyama, and Satoshi Sekine. (2004). “Named Entity Discovery Using Comparable News Articles.” In: Proceedings of the 20th International Conference on Computational Linguistics. doi:10.3115/1220355.1220477
- ABSTRACT: In this paper we describe a way to discover Named Entities by using the distribution of words in news articles. Named Entity recognition is an important task for today's natural language applications, but it still suffers from data sparseness. We used an observation that a Named Entity is likely to appear synchronously in several news articles, whereas a common noun is less likely. Exploiting this characteristic, we successfully obtained rare Named Entities with 90% accuracy just by comparing time series distributions of a word in two newspapers. Although the achieved recall is not sufficient yet, we believe that this method can be used to strengthen the lexical knowledge of a Named Entity tagger.