Named Entity Recognition (NER) System

Context:
- It can range from being a Heuristic NER System to being a Data-Driven NER System.
- It can be supported by an Entity Mention Detection System and an Entity Mention Type Classification System.
Example(s):
- a Rule-based NER System, such as an ANNIE System or a LingPipe System.
- a Dictionary-based NER System, such as an ANNIE System or a LingPipe System.
- a Statistical NER System, such as a LingPipe System or a Stanford Named Entity Recognizer System.
- a UIUC NER System, such as: http://cogcomp.cs.illinois.edu/page/demo_view/NERextended
- a Domain-Specific NER System, such as a ProMiner System or an AbGene System (both protein NER systems).
- …
Counter-Example(s):
See: Recognizer, NER Model.
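
The two-stage decomposition noted in the Context (an Entity Mention Detection System followed by an Entity Mention Type Classification System) can be illustrated with a minimal sketch of a heuristic, dictionary-based NER system. Everything in it is an assumption made for illustration: the `GAZETTEER` entries, the function names, and the greedy longest-match strategy are hypothetical and do not describe how ANNIE, LingPipe, or any other system listed above is implemented.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy gazetteer (illustrative only): lowercased surface form -> entity type.
GAZETTEER: Dict[str, str] = {
    "boston": "LOCATION",
    "jfk airport": "LOCATION",
    "poe": "PERSON",
}

# A mention is (start token index, end token index exclusive, surface text).
Mention = Tuple[int, int, str]


def tokenize(text: str) -> List[str]:
    """Split text into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text)


def detect_mentions(tokens: List[str], max_len: int = 3) -> List[Mention]:
    """Stage 1 (mention detection): greedy longest-match gazetteer lookup."""
    mentions: List[Mention] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate.lower() in GAZETTEER:
                mentions.append((i, i + n, candidate))
                i += n
                break
        else:
            # No gazetteer entry starts at this token; move on.
            i += 1
    return mentions


def classify_mention(mention: Mention) -> str:
    """Stage 2 (type classification): look up the type of a detected mention."""
    return GAZETTEER[mention[2].lower()]


def recognize(text: str) -> List[Tuple[str, str]]:
    """Run both stages and return (mention text, entity type) pairs."""
    tokens = tokenize(text)
    return [(m[2], classify_mention(m)) for m in detect_mentions(tokens)]


if __name__ == "__main__":
    print(recognize("Poe was born in Boston, not at JFK Airport."))
```

A data-driven NER system would keep the same interface but replace the gazetteer lookups with models trained on annotated examples. Unlike this sketch, such a system could also recognize entities that appear in no dictionary.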

References

https://github.com/glample/tagger
- QUOTE: … NER Tagger is an implementation of a Named Entity Recognizer that obtains state-of-the-art performance in NER on the 4 CoNLL datasets (English, Spanish, German and Dutch) without resorting to any language-specific knowledge or resources such as gazetteers. Details about the model can be found at: http://arxiv.org/abs/1603.01360 ...

(Wikipedia, 2011) ⇒ http://en.wikipedia.org/wiki/Named_entity_recognition#Approaches
- QUOTE: NER systems have been created that use linguistic grammar-based techniques as well as statistical models. Hand-crafted grammar-based systems typically obtain better precision, but at the cost of lower recall and months of work by experienced computational linguists. Statistical NER systems typically require a large amount of manually annotated training data.

(LingPipe, 2009) ⇒ LingPipe System. http://alias-i.com/lingpipe/web/demo-ne.html
- QUOTE: Named entity recognition finds mentions of things in text. … Named entity recognizers in LingPipe are trained from a corpus of data. The examples below extract mentions of people, locations or organizations in English news texts, and mentions of genes and other biological entities of interest in biomedical research literature. … LingPipe provides three statistical named-entity recognizers: TokenShapeChunker, CharLmHmmChunker, CharLmRescoringChunker.
  … Running this program on the same input provides the following results … The entity recognizer is 99.99% confident that p53 is a mention of a gene. … These confidences reflect the uncertainty of the recognizers. … The first requirement for training a named entity recognizer is gathering the data.

(Nadeau & Sekine, 2007) ⇒ David Nadeau, and Satoshi Sekine. (2007). “A Survey of Named Entity Recognition and Classification.” In: Lingvisticae Investigationes, 30(1).
- QUOTE: The ability to recognize previously unknown entities is an essential part of NERC systems. Such ability hinges upon recognition and classification rules triggered by distinctive features associated with positive and negative examples. While early studies were mostly based on handcrafted rules, most recent ones use supervised machine learning (SL) as a way to automatically induce rule-based systems or sequence labeling algorithms starting from a collection of training examples. This is evidenced, in the research community, by the fact that five systems out of eight were rule-based in the MUC-7 competition while sixteen systems were presented at CONLL-2003, a forum devoted to learning techniques. When training examples are not available, handcrafted rules remain the preferred technique, as shown in S. Sekine and Nobata (2004) who developed a NERC system for 200 entity types.

(Roth & Yih, 2002) ⇒ Dan Roth, and Wen-tau Yih. (2002). “Probabilistic Reasoning for Entity & Relation Recognition.” In: Proceedings of the 20th International Conference on Computational Linguistics (COLING 2002).
- QUOTE: In all earlier works we know of, the tasks of identifying entities and relations were treated as separate problems. The common procedure is to first identify and classify entities using a named entity recognizer and only then determine the relations between the entities. However, this approach has several problems. First, errors made by the named entity recognizer propagate to the relation classifier and may degrade its performance significantly. For example, if “Boston” is mislabeled as a person, it will never be classified as the location of Poe’s birthplace. Second, relation information is sometimes crucial to resolving ambiguous named entity recognition. For instance, if the entity “JFK” is identified as the victim of the assassination, the named entity recognizer is unlikely to misclassify it as a location (e.g. JFK airport).

(Cucerzan & Yarowsky, 1999) ⇒ Silviu Cucerzan, and David Yarowsky. (1999). “Language Independent Named Entity Recognition Combining Morphological and Contextual Evidence.” In: Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora (EMNLP-VLC 1999).
- QUOTE: This paper has presented an algorithm for the minimally supervised learning of named entity recognizers given short name lists as seed data (typically 40-100 example words per entity class).