Entity Reference Grounding Task
An Entity Reference Grounding Task is a reference grounding task that maps each entity reference to a canonical entity reference (when one exists).
- AKA: Entity Reference Normalization.
- Context:
- Input: an Artifact Set, e.g. a Corpus.
- Optionally: a requirement that the Entity References in each Artifact be Annotated in advance, e.g. by an Annotation Algorithm.
- Optionally: a requirement for an Entity Database.
- Output: a Mapping of all Entity References that refer to the same Entity Referent.
- Optionally: a mapping to the corresponding Entity Record, if an Entity Database is provided.
- It can be solved by an Entity Reference Grounding System that applies an Entity Reference Grounding Algorithm.
- …
- Example(s):
- BioCreAtIve II - Task 2: Gene Normalization Task, an Entity Mention Normalization Task.
- an Entity Mention Normalization Task, if the Entity References are all Entity Mentions, and an Entity Database is provided.
- an Entity Record Grounding Task, if the Entity References are all Entity Records.
- …
- Counter-Example(s):
- See: Record Linkage, Ontology Population Task, Relation Reference Resolution Task, Entity Coreference Resolution.
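The input/output contract described in the Context above can be illustrated with a minimal sketch. It assumes the Entity References are already annotated (given as plain strings) and that a small Entity Database is provided. The database contents, the identifiers such as `GENE:TP53`, and the function names are hypothetical and chosen only for illustration; exact matching after normalization stands in for a real Entity Reference Grounding Algorithm.

```python
from typing import Dict, List, Optional

# Hypothetical Entity Database: canonical entity ID -> known surface forms.
ENTITY_DATABASE: Dict[str, List[str]] = {
    "GENE:TP53": ["TP53", "p53", "tumor protein p53"],
    "GENE:BRCA1": ["BRCA1", "breast cancer 1"],
}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different forms compare equal."""
    return " ".join(text.lower().split())


def build_index(database: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the database into a lookup from normalized surface form to entity ID."""
    index: Dict[str, str] = {}
    for entity_id, forms in database.items():
        for form in forms:
            index[normalize(form)] = entity_id
    return index


def ground_references(
    references: List[str], database: Dict[str, List[str]]
) -> Dict[str, Optional[str]]:
    """Map each reference to its canonical entity ID, or None when no record exists."""
    index = build_index(database)
    return {ref: index.get(normalize(ref)) for ref in references}


mapping = ground_references(
    ["p53", "Tumor  Protein P53", "BRCA1", "XYZ9"], ENTITY_DATABASE
)
```

References that resolve to the same ID are thereby grouped under one Entity Referent, and a reference with no matching Entity Record maps to `None`, which reflects the "when one exists" clause of the definition.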
References
2018
- (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/Record_linkage#Entity_resolution Retrieved:2018-5-27.
- Entity resolution is an operational intelligence process, typically powered by an entity resolution engine or middleware, whereby organizations can connect disparate data sources with a view to understanding possible entity matches and non-obvious relationships across multiple data silos. It analyzes all of the information relating to individuals and/or entities from multiple sources of data, and then applies likelihood and probability scoring to determine which identities are a match and what, if any, non-obvious relationships exist between those identities.
Entity resolution engines are typically used to uncover risk, fraud, and conflicts of interest, but are also useful tools for use within customer data integration (CDI) and master data management (MDM) requirements. Typical uses for entity resolution engines include terrorist screening, insurance fraud detection, USA Patriot Act compliance, organized retail crime ring detection and applicant screening.
For example: Across different data silos – employee records, vendor data, watch lists, etc. – an organization may have several variations of an entity named ABC, which may or may not be the same individual. These entries may, in fact, appear as ABC1, ABC2, or ABC3 within those data sources. By comparing similarities between underlying attributes such as address, date of birth, or social security number, the user can eliminate some possible matches and confirm others as very likely matches.
Entity resolution engines then apply rules, based on common sense logic, to identify hidden relationships across the data. In the example above, perhaps ABC1 and ABC2 are not the same individual, but rather two distinct people who share common attributes such as address or phone number.
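The attribute comparison in the ABC1/ABC2/ABC3 example can be sketched as a weighted agreement score. This is only an illustration under stated assumptions: the attribute weights, the record values, and the `match_score` function are hypothetical and do not describe how any particular entity resolution engine computes likelihoods.

```python
from typing import Dict, Optional

Record = Dict[str, Optional[str]]

# Hypothetical integer weights: more discriminating attributes count for more.
WEIGHTS: Dict[str, int] = {"address": 2, "date_of_birth": 3, "ssn": 5}


def match_score(a: Record, b: Record, weights: Dict[str, int] = WEIGHTS) -> float:
    """Return the weighted share of agreeing attributes among those present in both records."""
    compared = 0
    agreed = 0
    for attribute, weight in weights.items():
        value_a, value_b = a.get(attribute), b.get(attribute)
        if value_a is None or value_b is None:
            continue  # a missing value is neither evidence for nor against a match
        compared += weight
        if value_a == value_b:
            agreed += weight
    return agreed / compared if compared else 0.0


# Made-up records for the three variants of "ABC".
abc1: Record = {"address": "1 Main St", "date_of_birth": "1980-01-01", "ssn": "000-00-0001"}
abc2: Record = {"address": "1 Main St", "date_of_birth": "1975-06-30", "ssn": "000-00-0002"}
abc3: Record = {"address": "9 Oak Ave", "date_of_birth": "1980-01-01", "ssn": "000-00-0001"}

score_12 = match_score(abc1, abc2)  # only the address agrees
score_13 = match_score(abc1, abc3)  # date of birth and SSN agree
```

Under these assumed weights, ABC1 and ABC3 score high and are likely the same individual, while ABC1 and ABC2 share only an address, which is the kind of pair the passage describes as two distinct people with a common attribute.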
2017
- (Bhattacharya & Getoor, 2017) ⇒ Indrajit Bhattacharya and Lise Getoor (2017) "Entity Resolution". In: Sammut C., Webb G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA.
- QUOTE: A fundamental problem in data cleaning and integration (see Data Preparation) is dealing with uncertain and imprecise references to real-world entities. The goal of entity resolution is to take a collection of uncertain entity references (or references, in short) from a single data source or multiple data sources, discover the unique set of underlying entities, and map each reference to its corresponding entity. This typically involves two subproblems – identification of references with different attributes to the same entity and disambiguation of references with identical attributes by assigning them to different entities.
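The two subproblems named in the quote can be illustrated with a small sketch: references with different attributes are merged when evidence links them (identification), and references with identical names are kept apart when no evidence links them (disambiguation). The matching rule used here (compatible name plus at least one shared context token) and all names and data are assumptions made for illustration, not the method of the cited authors.

```python
from typing import FrozenSet, List, Tuple

# A reference is a (name, context tokens) pair; the context could be co-occurring names.
Reference = Tuple[str, FrozenSet[str]]


def name_key(name: str) -> Tuple[str, str]:
    """Reduce a name to (first initial, last name) so 'John Smith' and 'J. Smith' are compatible."""
    parts = name.replace(".", "").split()
    return (parts[0][0].lower(), parts[-1].lower())


def same_entity(a: Reference, b: Reference) -> bool:
    """Naive rule: compatible names and at least one shared context token."""
    return name_key(a[0]) == name_key(b[0]) and bool(a[1] & b[1])


def resolve(references: List[Reference]) -> List[int]:
    """Cluster references with union-find and return one entity ID per reference."""
    parent = list(range(len(references)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(references)):
        for j in range(i + 1, len(references)):
            if same_entity(references[i], references[j]):
                parent[find(j)] = find(i)
    return [find(i) for i in range(len(references))]


refs: List[Reference] = [
    ("John Smith", frozenset({"a.lee", "b.kumar"})),
    ("J. Smith", frozenset({"b.kumar"})),   # different name form, same entity as the first
    ("J. Smith", frozenset({"c.ortiz"})),   # identical name to the second, different entity
]
entity_ids = resolve(refs)
```

The first two references receive the same entity ID despite differing names, and the two "J. Smith" references receive different IDs despite identical names, so each reference ends up mapped to one underlying entity.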