BART Coreference System
A BART Coreference System is a modular Coreference Resolution System that implements a variety of Machine Learning Algorithms for solving Coreference Resolution Tasks.
- AKA: Beautiful Anaphora Resolution Toolkit.
- Context:
- It was developed by Versley et al. (2008).
- It consists of the following modules:
- a Preprocessing Module: it uses the MMAX2 Annotation Tool (Müller & Strube, 2006) with MiniDiscourse and includes 3 NLP pipelines:
- Chunking Pipeline - it implements the Stanford POS Tagger (Toutanova et al., 2003), the YamCha Chunker (Kudoh & Matsumoto, 2000), and the Stanford Named Entity Recognizer (Finkel et al., 2005);
- Parsing Pipeline - it implements Charniak and Johnson's Reranking Parser (Charniak & Johnson, 2005);
- Carafe Pipeline - it implements the ACE Mention Tagger provided by MITRE (Wellner & Vilain, 2006);
- a Feature Extraction Module - it implements a Coreference Resolution Algorithm based on Soon et al. (2001);
- a Machine Learning Module - it implements several machine learning toolkits, including WEKA (Witten & Frank, 2005), SVMLight (Joachims, 1999), and SVMLight-TK (Moschitti, 2006), as well as a Maximum Entropy Classifier;
- a Training Module - it implements an Encoder/Decoder Algorithm.
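The mention-pair approach of Soon et al. (2001), which the Feature Extraction and Machine Learning modules above build on, can be sketched as follows. This is an illustrative toy sketch, not BART's actual code: the mention format, feature set, and classifier rule are invented for the example, standing in for the learned MaxEnt/SVM models.

```python
# Toy mention-pair coreference sketch in the spirit of Soon et al. (2001):
# a feature extractor turns (antecedent, anaphor) pairs into features,
# a classifier scores each pair, and a closest-first decoder links mentions.

def extract_features(antecedent, anaphor):
    """Toy pair features (hypothetical, for illustration only)."""
    return {
        "same_string": antecedent["text"].lower() == anaphor["text"].lower(),
        "distance": anaphor["index"] - antecedent["index"],
        "both_pronouns": antecedent["is_pronoun"] and anaphor["is_pronoun"],
    }

def toy_classifier(features):
    """Stand-in for the learned model (MaxEnt or SVM in BART)."""
    return features["same_string"] or (
        features["both_pronouns"] and features["distance"] <= 2
    )

def resolve(mentions):
    """Closest-first decoding: link each anaphor to the nearest
    preceding mention that the classifier accepts as coreferent."""
    links = {}
    for j, anaphor in enumerate(mentions):
        for i in range(j - 1, -1, -1):  # scan candidates right-to-left
            if toy_classifier(extract_features(mentions[i], anaphor)):
                links[j] = i
                break
    return links

mentions = [
    {"index": 0, "text": "BART", "is_pronoun": False},
    {"index": 1, "text": "it",   "is_pronoun": True},
    {"index": 2, "text": "BART", "is_pronoun": False},
]
print(resolve(mentions))  # → {2: 0}
```

In the real system, the decoder and the pair-generation strategy are configurable, which is what makes the toolkit modular.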
- Example(s):
- Counter-Example(s):
- See: Entity Mention Normalization System, Natural Language Processing System, Information Extraction System, Anaphora Resolution System, Text Tokenization System, Sentence Segmentation System, Morphological Analysis System, Part-of-Speech Tagging System, Noun Phrase Identification System, Named Entity Recognition System.
References
2019
- (Bart-Coref, 2019) ⇒ http://www.bart-coref.org/ Retrieved: 2019-03-31.
- QUOTE: BART, the Beautiful Anaphora Resolution Toolkit, is a product of the project Exploiting Lexical and Encyclopedic Resources For Entity Disambiguation at the Johns Hopkins Summer Workshop 2007.
BART performs automatic coreference resolution, including all necessary preprocessing steps.
BART incorporates a variety of machine learning approaches and can use several machine learning toolkits, including WEKA and an included MaxEnt implementation.
2008
- (Versley et al., 2008) ⇒ Yannick Versley, Simone Paolo Ponzetto, Massimo Poesio, Vladimir Eidelman, Alan Jern, Jason Smith, Xiaofeng Yang, and Alessandro Moschitti. (2008). “BART: A Modular Toolkit for Coreference Resolution.” In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Demo Session.
- QUOTE: Using the built-in maximum entropy learner with feature combination, BART reaches 65.8% F-measure on MUC6 and 62.9% F-measure on MUC7 using Soon et al.’s features, outperforming JAVARAP on pronoun resolution, as well as the Soon et al. reimplementation of Uryupina (2006). Using a specialized tagger for ACE mentions and an extended feature set including syntactic features (e.g. using tree kernels to represent the syntactic relation between anaphor and antecedent, cf. Yang et al. 2006), as well as features based on knowledge extracted from Wikipedia (cf. Ponzetto and Smith, in preparation), BART reaches state-of-the-art results on ACE-2(...)
The BART toolkit has been developed as a tool to explore the integration of knowledge-rich features into a coreference system at the Johns Hopkins Summer Workshop 2007. It is based on code and ideas from the system of Ponzetto and Strube (2006), but also includes some ideas from GUITAR (Steinberger et al., 2007) and other coreference systems (Versley, 2006; Yang et al., 2006)[1].
Figure 2: Example system configuration
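The F-measures reported in the quote above are the harmonic mean of precision and recall. A minimal sketch of the arithmetic (the precision/recall inputs here are hypothetical values chosen only to reproduce a 65.8% score; they are not the actual MUC-6 numbers):

```python
def f_measure(precision, recall):
    """Balanced F1: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical inputs, for illustration of the formula only.
print(round(f_measure(0.670, 0.646), 3))  # → 0.658
```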
- ↑ An open source version of BART is available from http://www.sfs.uni-tuebingen.de/~versley/BART/.