Entity Mention
An entity mention is a referring expression that is an entity referencer (whose referent is an entity).
- AKA: Markable Entity, Referring Expression, Discourse Entity, Entity Instance.
- Context:
- It must, like any text-based mention, have an Entity Mention Location (defined by a mention start and a mention end).
- It can range from being a Named Entity Mention, to being a Nominal Mention, to being a Pronoun Mention.
- It can range from being an Unnested Entity Mention to being a Nested Entity Mention.
- It can range from being a Single-Word Entity Mention to being a Multi-Word Entity Mention.
- It can range from being an Explicit Entity Mention, to being an Implicit Entity Mention, to being an Ambiguous Entity Mention.
- It can be classified into an Entity Type, such as Person or Algorithm (by an Entity Mention Classification System).
- It can be in an Entity Mention Relation (such as an identity relation) with another Entity Mention.
- It can have an Entity Referent that can be resolved by an Entity Mention Resolution System (that applies an Entity Mention Resolution Algorithm).
- It can be mapped to a Coreferent Entity Mention (with the same entity referent) by an Entity Mention Coreference Resolution System.
- It can be mapped to a Canonical Entity Record (by an Entity Mention Linking System, if it is a Linkable Entity Mention).
- It can range from being a Disambiguated Entity Mention to being an Undisambiguated Entity Mention.
- It can range from being an Unannotated Entity Mention to being an Annotated Entity Mention.
- It can be annotated by an Entity Mention Annotation Task.
- It can be detected by an Entity Mention Detection Task (that can be solved by an Entity Mention Detection System).
- It can be recognized by an Entity Mention Recognition Task (that can be solved by an Entity Mention Recognition System).
- It can belong to an Entity Mention Set.
- It can (typically) be a noun phrase, such as: a Definite Noun Phrase, a Demonstrative Noun Phrase, a Proper Name, an Appositive, a Modifying Sub–Noun Phrase, and a Pronoun.
- It can range from being:
- a Proper Name Mention.
- a Definite Description Phrase Mention, e.g. 'the tallest man in the world'.
- a Demonstrative Term Mention, e.g. 'this man', 'that woman'.
- a Pronoun Mention.
- It can range from being a Simple Entity Mention to being a Semi-Structured Entity Mention (e.g. a citation mention).
- Example(s):
- Named Entity Mention examples:
- “London” is a Location Mention in the noun phrase "The city of [London]".
- “Unicorn” is a Fictional Entity Mention in the Prepositional Phrase "on a [Unicorn]".
- “Black-crowned Central American Squirrel Monkey” is a Long Entity Mention.
- “Finnegan, Henderson, Farabow, Garrett & Dunner, L.L.P.” is a Law Firm Mention.
- “Nokia N95 G8” is a Product Entity Mention.
- “Cape of Good Hope” is a Location Mention.
- ??? “antidisestablishmentarianism” is a Concept Mention (that refers to a movement or ideology that opposes disestablishment).
- ??? “The biggest ocean on our planet”.
- a Citation Mention such as “In (Melli, 2007a) the approach is ...”.
- “Escherichia Coli”, a Technical Term Mention.
- Nominal Entity Mention examples:
- “people” and “Prime Ministers” refer to groups of people.
- “the camera”
- ?? Noun Phrase Entity Mention ??
- “a camera with five exposure settings”
- Pronoun Mention examples:
- "they", "them", and "us" refer to a group of people.
- Examples of Nested Entity Mentions:
- The noun phrase “The president of Ford” mentions a Person Entity but it also mentions an Organization Entity. It can be annotated as: "[PERSON The president of [ORGANIZATION Ford]]”.
- The noun phrase "The historian who taught herself Cobol" mentions the same Person Entity three times: once as the entire phrase, and once by each of the two Pronouns "who" and "herself". It can be annotated as: "[PERSON1 The historian [PERSON1 who] taught [PERSON1 herself] Cobol]".
- The Sentence "He is the man who killed the president of the United States." mentions two People and one Semantic Relation, and has two Nested Entity Mentions.
- The two Sentences "Bruce P. Smith is the Dean of The Faculty of Science. Dean Smith had been Associate Dean for Academic Affairs since April." both have Nested Entity Mentions and together are an example of a challenging Coreference Resolution Task (because they include the Honorific "Dean"). They can be annotated as: "[PERSON1 Bruce P. Smith] is [PERSON1 the Dean of [ORGANIZATION1 The Faculty of Science]]. [PERSON1 Dean Smith] had been [PERSON1 Associate Dean for [ORGANIZATION2 Academic Affairs]] since [TIME1 April]."
- an Abstract Concept Mention, as in: "The supervised algorithm was accurate".
- …
- Counter-Example(s):
- an Entity Record.
- a Verb Mention, such as “ate” as used in the sentence "I ate cake."
- an Adverb Mention, such as “quickly” as used in the sentence "I quickly ate the cake.", which is a reference to an action qualifier.
- a Verb Phrase Mention, such as “quickly ate the cake” as used in the sentence "I quickly ate the cake."
- an Entity Mention Qualifier, such as “big” as used in the phrase “the big elephant”.
- a Product Attribute Mention, such as “red” as used in the phrase “my red Nokia N95”.
- See: Entity Memory.
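The mention location (start and end) and the nested bracket annotations used in the examples above can be made concrete with a small Python sketch. It is illustrative only: the `EntityMention` class and the `parse_bracketed` and `is_nested` functions are hypothetical names introduced here, and the "[LABEL text]" notation is simply the one used in this page's examples, not a standard annotation format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EntityMention:
    """A text span that refers to an entity (hypothetical structure)."""
    start: int        # offset of the first character in the plain text
    end: int          # offset one past the last character
    entity_type: str  # label as written in the annotation, e.g. "PERSON"
    text: str         # surface string of the mention


def parse_bracketed(annotated: str) -> Tuple[str, List[EntityMention]]:
    """Strip "[LABEL ...]" markup and return (plain_text, mentions)."""
    plain: List[str] = []
    stack: List[Tuple[str, int]] = []  # (label, start offset) of open brackets
    mentions: List[EntityMention] = []
    i = 0
    while i < len(annotated):
        ch = annotated[i]
        if ch == "[":
            # The label runs up to the first space after the bracket.
            j = annotated.index(" ", i)
            stack.append((annotated[i + 1:j], len(plain)))
            i = j + 1
        elif ch == "]":
            if not stack:
                raise ValueError("closing bracket without an opening bracket")
            label, start = stack.pop()
            end = len(plain)
            mentions.append(
                EntityMention(start, end, label, "".join(plain[start:end]))
            )
            i += 1
        else:
            plain.append(ch)
            i += 1
    if stack:
        raise ValueError("unbalanced brackets in annotation")
    # Outer mentions first: sort by start, then by longest span.
    return "".join(plain), sorted(mentions, key=lambda m: (m.start, -m.end))


def is_nested(inner: EntityMention, outer: EntityMention) -> bool:
    """True if `inner` lies inside `outer` and is not the same span."""
    return (outer.start <= inner.start and inner.end <= outer.end
            and (inner.start, inner.end) != (outer.start, outer.end))


text, mentions = parse_bracketed("[PERSON The president of [ORGANIZATION Ford]]")
for m in mentions:
    print(m.entity_type, (m.start, m.end), repr(m.text))
```

Storing character offsets rather than only the surface string keeps two mentions with identical text (for example, a repeated pronoun) distinguishable, which is what a coreference or linking step would need.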
References
2011
- (Cheng, 2011) ⇒ Tao Cheng. (2011). “Toward Entity-Aware Search." PhD Thesis, University of Illinois at Urbana-Champaign.
- QUOTE: We use a prefix # sign (e.g., #phone for phone entity) throughout the thesis to distinguish entities from keywords. Further, each entity type [math]\displaystyle{ E_i }[/math] is an set of entity instances that are extracted from the corpus, i.e., literal values of entity type [math]\displaystyle{ E_i }[/math] that occur somewhere in some document [math]\displaystyle{ d \in D }[/math]. We use [math]\displaystyle{ e_i }[/math] to denote an entity instance of entity type [math]\displaystyle{ E_i }[/math]. In the example of phone-number patterns, we may extract #phone = {“800-2017575”, “244-2919”, ...}
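The idea of an entity type as the set of literal values found in a corpus can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the regular expression, the sample documents, and the function name `extract_entity_instances` are hypothetical and are not the extraction method used in the thesis.

```python
import re
from typing import Iterable, Set

# Hypothetical pattern covering the two shapes in the quoted example:
# "800-2017575" (3 digits, hyphen, 7 digits) and "244-2919" (3 digits,
# hyphen, 4 digits). Real phone-number extraction needs many more cases.
PHONE_PATTERN = re.compile(r"\b(?:\d{3}-\d{7}|\d{3}-\d{4})\b")


def extract_entity_instances(documents: Iterable[str],
                             pattern: "re.Pattern[str]") -> Set[str]:
    """Collect the set of literal values matching `pattern` in any document."""
    instances: Set[str] = set()
    for doc in documents:
        instances.update(pattern.findall(doc))
    return instances


# Illustrative corpus D; the second document repeats one value.
corpus = [
    "Call 800-2017575 for support.",
    "Front desk: 244-2919 (again: 244-2919).",
]
phone = extract_entity_instances(corpus, PHONE_PATTERN)
print(sorted(phone))
```

Because the result is a set, repeated occurrences collapse into one entity instance; the individual occurrences in the text are the entity mentions.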
2009
- WordNet.
- mention: make reference to.
- en.wiktionary.org/wiki/mention
- mention: A speaking or notice of anything, usually in a brief or cursory manner. Used especially in the phrase to make mention of; To speak of something
- (Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Referring_expression
- A referring expression (RE), in linguistics, is any noun phrase, or surrogate for a noun phrase, whose function in a text (spoken, signed or written on a particular occasion) is to "pick out" an individual person, place, object, or a set of persons, places, objects, etc. The technical terminology for "pick out" differs a great deal from one school of linguistics to another. The most widespread term is probably refer, and a thing "picked out" is a referent, as for example in the work of John Lyons. In linguistics, the study of reference belongs to pragmatics, the study of language use, though it is also a matter of great interest to philosophers, especially those wishing to understand the nature of knowledge, perception and cognition more generally.
- The kinds of expressions which can refer (as so defined) are:
- (1) a noun phrase of any structure, such as: the taxi in The taxi's waiting outside; the apple on the table in Bring me the apple on the table; and those five boys in Those five boys were off school last week. In those languages which, like English, encode definiteness, REs are typically marked for definiteness. In the examples given, this is done by the definite article the or the demonstrative adjective, here those.
- (2) a noun-phrase surrogate, i.e. a pronoun, such as it in It's waiting outside and Bring me it; and they in They were off school last week. The referent of such a pronoun may vary according to context - e.g. the referent of me depends on who the speaker is - and this property is technically an instance of deixis.
- (3) a proper name, like Sarah, London, The Eiffel Tower, or The Beatles. The intimate link between proper names and type (1) REs is shown by the definite article that appears in many of them. In many languages this happens far more consistently than in English. Proper names are often taken to refer, in principle, to the same referent independently of the context in which the name is used and in all possible worlds, i.e. they are in Saul Kripke's terminology rigid designators.
- (Kulkarni et al., 2009) ⇒ Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, Soumen Chakrabarti. (2009). “Collective Annotation of Wikipedia Entities in Web Text.” In: Proceedings of ACM SIGKDD Conference (KDD-2009). doi:10.1145/1557019.1557073.
- To take the first step beyond keyword-based search toward entity-based search, suitable token spans ("spots") on documents must be identified as references to real-world entities from an entity catalog.
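A naive way to identify candidate "spots" is a longest-match lookup of token spans against the surface forms in an entity catalog. The Python sketch below is an assumption-laden illustration, not the collective annotation method of the cited paper: the catalog contents, the entity identifiers, and the function name `find_spots` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical catalog: lowercased token sequence -> candidate entity ids.
CATALOG: Dict[Tuple[str, ...], List[str]] = {
    ("london",): ["London_(England)", "London_(Ontario)"],
    ("cape", "of", "good", "hope"): ["Cape_of_Good_Hope"],
}
MAX_LEN = max(len(key) for key in CATALOG)


def find_spots(tokens: List[str]) -> List[Tuple[int, int, List[str]]]:
    """Return (start, end, candidates) token spans, preferring longest matches."""
    spots: List[Tuple[int, int, List[str]]] = []
    i = 0
    while i < len(tokens):
        # Try the longest possible span first, then shrink.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in CATALOG:
                spots.append((i, i + n, CATALOG[key]))
                i += n
                break
        else:
            i += 1  # no catalog entry starts at this token
    return spots


tokens = "He sailed from London to the Cape of Good Hope".split()
spots = find_spots(tokens)
for start, end, candidates in spots:
    print(" ".join(tokens[start:end]), "->", candidates)
```

A spot with more than one candidate, such as "London" here, is still an undisambiguated mention; choosing among the candidates is the separate disambiguation step.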
2008
- (Sarawagi, 2008) ⇒ Sunita Sarawagi. (2008). “Information extraction.” In: FnT Databases, 1(3).
- QUOTE: Entities are typically noun phrases and comprise of one to a few tokens in the unstructured text. The most popular form of entities is named entities like names of persons, locations, and companies as popularized in the MUC [57, 100], ACE [1, 159], and CoNLL [206] competitions.
- (Ding et al., 2008) ⇒ Xiaowen Ding, Bing Liu, and Philip S. Yu. (2008). “A Holistic Lexicon-based Approach to Opinion Mining.” In: Proceedings of the International Conference on Web Search and Web Data Mining (WSDM 2008).
- Definition (object): An object O is an entity which can be a product, person, event, organization, or topic. It is associated with a pair, O: (T, A), where [math]\displaystyle{ T }[/math] is a hierarchy or taxonomy of components (or parts), sub-components, and so on, and [math]\displaystyle{ A }[/math] is a set of attributes of O. Each component has its own set of subcomponents and attributes.
Example 1: A particular brand of digital camera is an object. It has a set of components, e.g., lens, battery, etc., and also a set of attributes, e.g., picture quality, size, etc. The battery component also has its set of attributes, e.g., battery life, battery size, etc. Essentially, an object is represented as a tree. The root is the object itself. Each non-root node is a component or subcomponent of the object. Each link is a part-of relationship. Each node is also associated with a set of attributes. An opinion can be expressed on any node and any attribute of the node.
Example 2: Following Example 1, one can express an opinion on the camera (the root node), e.g., “I do not like this camera”, or on one of its attributes, e.g., “the picture quality of this camera is poor”. Likewise, one can also express an opinion on any one of the camera’s components or the attribute of the component.
To simplify our discussion, we use the word “features” to represent both components and attributes, which allows us to omit the hierarchy. Using features for products is also quite common in practice. For an ordinary user, it is probably too complex to use a hierarchical representation of features and opinions. We note that in this framework the object itself is also treated as a feature.
Let the review be r. In the most general case, r consists of a sequence of sentences r = <s1, s2, …, sm>.
- Definition (explicit and implicit feature):
- If a feature f appears in review r, it is called an explicit feature in r. If f does not appear in r but is implied, it is called an implicit feature in r.
- Example 3: “battery life” in the following sentence is an explicit feature:
- “The battery life of this camera is too short”.
- “Size” is an implicit feature in the following sentence as it does not appear in the sentence but it is implied:
- “This camera is too large”.
- Here, “large” is called a feature indicator.
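The explicit/implicit distinction in the definition above can be illustrated with a short Python sketch. It assumes a hand-written feature list and a hand-written table of feature indicators; both tables and the function name `find_features` are hypothetical, and the substring and word matching is deliberately naive.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical feature vocabulary for a camera review domain.
FEATURES: List[str] = ["battery life", "picture quality", "size"]
# Hypothetical feature indicators: opinion word -> the feature it implies.
FEATURE_INDICATORS: Dict[str, str] = {"large": "size", "small": "size"}


def find_features(sentence: str) -> List[Tuple[str, str]]:
    """Label each feature found in `sentence` as explicit or implicit."""
    lowered = sentence.lower()
    # Explicit: the feature name itself appears (naive substring test).
    found = [(f, "explicit") for f in FEATURES if f in lowered]
    explicit = {f for f, _ in found}
    # Implicit: the name is absent but an indicator word implies it.
    for word in re.findall(r"[a-z]+", lowered):
        implied = FEATURE_INDICATORS.get(word)
        if (implied and implied not in explicit
                and (implied, "implicit") not in found):
            found.append((implied, "implicit"))
    return found


print(find_features("The battery life of this camera is too short"))
print(find_features("This camera is too large"))
```

In mention terms, the explicit case has a span in the text to point to, while the implicit case has only the indicator word's span.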
2001
- (Soon et al., 2001) ⇒ Wee Meng Soon, Hwee Tou Ng, and Daniel Chung Yong Lim. (2001). “A Machine Learning Approach to Coreference Resolution of Noun Phrases.” In: Computational Linguistics, Vol. 27, No. 4.
- QUOTE: Specifically, a coreference relation denotes an identity of reference and holds between two textual elements known as markables, which can be definite noun phrases, demonstrative noun phrases, proper names, appositives, sub–noun phrases that act as modifiers, pronouns, and so on.
1982
- (Evans, 1982) ⇒ Gareth Evans. (1982). “The Varieties of Reference." Oxford University Press, (published posthumously, edited by John McDowell).
- QUOTE: The class of referring expressions has traditionally been taken to include proper names; definite descriptions ('the tallest man in the world'); demonstrative terms ('this man', 'that woman'); and some pronouns.