ANNIE System
The ANNIE System is an Information Extraction System.
- Context:
- It is a component of the GATE System.
- It is a Rule-based System.
- It makes extensive use of a Dictionary/gazetteer.
- See: Named Entity Recognition System.
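The gazetteer step mentioned above can be illustrated with a minimal sketch. This is not ANNIE's implementation or GATE's API: the `GAZETTEER` entries, the `tokenize` helper, and the greedy longest-match strategy are all assumptions chosen for illustration.

```python
import re

# Hypothetical toy gazetteer (phrase -> entity type); not ANNIE's real lists.
GAZETTEER = {
    "new york": "Location",
    "york": "Location",
    "hamish cunningham": "Person",
    "university of sheffield": "Organization",
}
MAX_PHRASE_TOKENS = max(len(p.split()) for p in GAZETTEER)


def tokenize(text):
    """Return (token, start, end) triples for runs of word characters."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+", text)]


def gazetteer_lookup(text):
    """Greedy longest-match lookup; returns (start, end, type, surface) tuples."""
    tokens = tokenize(text)
    matches = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(MAX_PHRASE_TOKENS, len(tokens) - i), 0, -1):
            phrase = " ".join(tok.lower() for tok, _, _ in tokens[i:i + n])
            if phrase in GAZETTEER:
                start, end = tokens[i][1], tokens[i + n - 1][2]
                matches.append((start, end, GAZETTEER[phrase], text[start:end]))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return matches


if __name__ == "__main__":
    sample = "Hamish Cunningham works at the University of Sheffield, not in New York."
    for match in gazetteer_lookup(sample):
        print(match)
```

Because longer phrases are tried first, "New York" is matched as one entry rather than as the shorter entry "york". The sketch ignores punctuation between tokens, so a phrase could match across a comma; a real gazetteer would need to handle that case.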
References
2009
- http://www.gate.ac.uk/ie/annie.html
- ANNIE. Annie — a robust cross-domain IE system.
- ANNIE relies on finite-state algorithms and the JAPE language.
- ANNIE pipeline: http://www.gate.ac.uk/sale/tao/index.html#x1-2070021
- http://www.gate.ac.uk/sale/tao/annie.png
2007
- (Recupero, 2007) ⇒ Diego Reforgiato Recupero. (2007). “A New Unsupervised Method for Document Clustering by using WordNet Lexical and Conceptual Relations.” In: Information Retrieval (2007) 10:563–579.
2002
- (Cunningham et al., 2002) ⇒ Hamish Cunningham, Diana Maynard, Kalina Bontcheva, and Valentin Tablan. (2002). “GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications.” In: Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (ACL 2002).
- QUOTE: Provided with GATE is a set of reusable processing resources for common NLP tasks. (None of them are definitive, and the user can replace and/or extend them as necessary.) These are packaged together to form ANNIE, A Nearly-New IE system, but can also be used individually or coupled together with new modules in order to create new applications. For example, many other NLP tasks might require a sentence splitter and POS tagger, but would not necessarily require resources more specific to IE tasks such as a named entity transducer. The system is in use for a variety of IE and other tasks, sometimes in combination with other sets of application-specific modules.
ANNIE consists of the following main processing resources: tokeniser, sentence splitter, POS tagger, gazetteer, finite state transducer (based on GATE’s built-in regular expressions over annotations language (Cunningham et al., 2002)), orthomatcher and coreference resolver. The resources communicate via GATE’s annotation API, which is a directed graph of arcs bearing arbitrary feature/value data, and nodes rooting this data into document content (in this case text).
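The quoted description of processing resources communicating through a shared annotation graph can be sketched as follows. This is a toy model, not GATE's API. The `Annotation` and `Document` classes, the `TITLES` list, and the `person_rule` function are hypothetical. The names `Token`, `Lookup`, `orth`, and `majorType` are borrowed loosely from GATE conventions and should be read as illustrative. A hand-written Python function stands in for a JAPE grammar.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """An arc between two nodes (character offsets) carrying feature/value data."""
    start: int
    end: int
    type: str
    features: dict = field(default_factory=dict)


@dataclass
class Document:
    text: str
    annotations: list = field(default_factory=list)

    def of_type(self, type_):
        """Annotations of one type, ordered by start offset."""
        return sorted((a for a in self.annotations if a.type == type_),
                      key=lambda a: a.start)


def tokeniser(doc):
    """Add a Token annotation for each word or punctuation mark."""
    for m in re.finditer(r"\w+|[^\w\s]", doc.text):
        word = m.group()
        orth = "upperInitial" if word[0].isupper() else "other"
        doc.annotations.append(
            Annotation(m.start(), m.end(), "Token", {"string": word, "orth": orth}))


TITLES = {"Dr", "Prof", "Mr", "Ms"}  # hypothetical gazetteer list


def gazetteer(doc):
    """Add a Lookup annotation over each token found in the title list."""
    for tok in doc.of_type("Token"):
        if tok.features["string"] in TITLES:
            doc.annotations.append(
                Annotation(tok.start, tok.end, "Lookup", {"majorType": "title"}))


def person_rule(doc):
    """Rule: title Lookup, optional '.', then one or more upperInitial Tokens."""
    tokens = doc.of_type("Token")
    title_starts = {a.start for a in doc.of_type("Lookup")
                    if a.features.get("majorType") == "title"}
    i = 0
    while i < len(tokens):
        if tokens[i].start in title_starts:
            j = i + 1
            if j < len(tokens) and tokens[j].features["string"] == ".":
                j += 1  # skip the period after an abbreviated title
            k = j
            while k < len(tokens) and tokens[k].features["orth"] == "upperInitial":
                k += 1
            if k > j:
                doc.annotations.append(
                    Annotation(tokens[i].start, tokens[k - 1].end, "Person",
                               {"rule": "TitlePerson"}))
                i = k
                continue
        i += 1


def run_pipeline(text, resources):
    """Apply each processing resource in order to one shared document."""
    doc = Document(text)
    for resource in resources:
        resource(doc)
    return doc


if __name__ == "__main__":
    doc = run_pipeline(
        "Yesterday Dr. Diana Maynard met Prof Kalina Bontcheva in Sheffield.",
        [tokeniser, gazetteer, person_rule])
    for person in doc.of_type("Person"):
        print(doc.text[person.start:person.end], person.features)
```

Each resource only reads and writes annotations on the shared document. This mirrors the point in the quotation that resources can be replaced, reordered, or reused individually, for example running the tokeniser alone for a task that needs no named entity step.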