Classification-based Coreference Resolution System
A Classification-based Coreference Resolution System is a Supervised Coreference Resolution System that is based on a two-step procedure in which it first applies a classification algorithm and then a clustering algorithm.
- AKA: Classifier-based Coreference Resolution System, Two-Step Classification Coreference Resolution System, Binary Classification Coreference Resolution System.
- Context:
- It can solve a Classification-based Coreference Resolution Task by implementing a Classification-based Coreference Resolution Algorithm.
- …
- Example(s):
- Counter-Example(s):
- A Rule-based Coreference Resolution System,
- An Entity-Centric Coreference Resolution System,
- A Centroid-based Joint Canonicalization-Coreference Resolution System,
- A Ranking-based Coreference Resolution System,
- A Stacking-based Coreference Resolution System,
- A Graph-based Coreference Resolution System.
- See: Coreference Resolution System, Classification Task, Clustering Task, Entity Mention Normalization System, Natural Language Processing System, Information Extraction System.
References
2015
- (Sawhney & Wang, 2015) ⇒ Kartik Sawhney, and Rebecca Wang. (2015). “Coreference Resolution.”
- QUOTE: In this section, we discuss the features we added to our statistical classifier and the effect of each. The classifier extracts these features from a set of examples of coreferent pairs (both positive and negative), and then assigns weights to these features during training, which can then be used to classify new examples in the dev/test sets as coreferent or not coreferent.
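As a rough illustration of the kind of pairwise feature extraction such a statistical classifier relies on, the sketch below turns one (antecedent, anaphor) candidate pair into a feature dictionary. The Mention fields and the specific features (string match, sentence distance, gender/number agreement) are illustrative assumptions, not the exact feature set used by Sawhney & Wang.

```python
# Hypothetical sketch: features for one (antecedent, anaphor) candidate pair.
# Field names and features are assumptions for illustration only.
from dataclasses import dataclass

@dataclass
class Mention:
    text: str        # surface string of the markable
    head: str        # head word of the markable
    sent_index: int  # index of the sentence containing the markable
    gender: str      # "m", "f", "n", or "unknown"
    number: str      # "sg", "pl", or "unknown"

def pair_features(antecedent: Mention, anaphor: Mention) -> dict:
    """Turn a candidate pair into a feature dictionary for a binary classifier."""
    return {
        "exact_match": antecedent.text.lower() == anaphor.text.lower(),
        "head_match": antecedent.head.lower() == anaphor.head.lower(),
        "sent_distance": anaphor.sent_index - antecedent.sent_index,
        "gender_agree": antecedent.gender == anaphor.gender,
        "number_agree": antecedent.number == anaphor.number,
    }

# Training then reduces to fitting any off-the-shelf binary classifier
# (logistic regression, decision tree, etc.) on labelled positive and
# negative pairs represented by such feature dictionaries.
```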
2011a
- (Zheng et al., 2011) ⇒ Jiaping Zheng, Wendy W. Chapman, Rebecca S. Crowley, and Guergana K. Savova. (2011). “Coreference Resolution: A Review of General Methodologies and Applications in the Clinical Domain.” In: Journal of Biomedical Informatics, 44(6). doi:10.1016/j.jbi.2011.08.006
- QUOTE: In the mid-1990s, methods for performing supervised coreference resolution sprang up. The widespread availability of the MUC and ACE corpora further shaped the research community to move towards statistical approaches. Complete heuristics-based systems gradually saw a decline of interest in the community, although isolated rules are still employed to encode hard linguistic constraints. Two types of machine learning methods emerged—a two-step binary classification followed by clustering and a ranking approach. The key distinction between them is that the binary classification approach makes coreference decisions on the antecedent candidates independently of each other, while the ranking approach takes into account other antecedent candidates. (...)
The binary classification approach involves two steps. First, for a given anaphor, the classifier determines for each candidate antecedent whether the anaphor corefers with the antecedent. A clustering algorithm then takes these pairwise coreference decisions and generates a partition of the set of all markables in the document, such that all the markables in each partition refer to the same entity. This process is named the “mention-pair” model, since it hinges on a pair of markables (mentions).
A different but similar approach is the “entity-mention” model. It also casts the task as a binary classification problem, except that the classifier predicts whether a markable is coreferent with a partially-formed entity (chain), instead of a single markable as in the “mention-pair” model. The second clustering step proceeds in an analogous manner.
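A minimal sketch of this two-step mention-pair procedure is given below: a pairwise classifier is queried for each candidate antecedent, and each anaphor is linked to its closest accepted antecedent, in the style of Soon et al. (2001, cited below); best-first linking (Ng & Cardie, 2002) is a common alternative. The `is_coreferent` callable is assumed to wrap a trained binary pairwise classifier.

```python
# Minimal sketch of the "mention-pair" pipeline: pairwise classification
# followed by closest-first linking into a partition of the markables.
# `mentions` is a list of markables in textual order; `is_coreferent` is an
# assumed wrapper around a trained binary classifier, returning True when a
# (candidate antecedent, anaphor) pair is judged coreferent.

def resolve_mention_pair(mentions, is_coreferent):
    cluster_of = {}   # mention index -> cluster id
    clusters = []     # cluster id -> set of mention indices
    for j in range(len(mentions)):
        chosen = None
        # Scan candidate antecedents right-to-left and stop at the first
        # (i.e. closest) one the classifier accepts.
        for i in range(j - 1, -1, -1):
            if is_coreferent(mentions[i], mentions[j]):
                chosen = cluster_of[i]
                break
        if chosen is None:            # no antecedent accepted: new entity
            chosen = len(clusters)
            clusters.append(set())
        clusters[chosen].add(j)
        cluster_of[j] = chosen
    return clusters                   # each set of indices is one entity
```

Because each pairwise decision is made independently, it is only the linking step that turns them into a consistent partition, which is the locally confined behaviour criticised in the 2011b quote below.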
2011b
- (Klenner et al., 2011) ⇒ Manfred Klenner, and Don Tuggener. (2011). “An Incremental Model for Coreference Resolution with Restrictive Antecedent Accessibility.” In: Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task.
- QUOTE: As recently discussed in (Ng, 2010), the so called mention-pair model suffers from several design flaws which originate from the locally confined perspective of the model:
- Generation of (transitively) redundant pairs, as the formation of coreference sets (coreference clustering) is done after pairwise classification;
- Thereby generation of skewed training sets which lead to classifiers biased towards negative classification;
- No means to enforce global constraints such as transitivity;
- Underspecification of antecedent candidates;
- These problems can be remedied by an incremental entity-mention model, where candidate pairs are evaluated on the basis of the emerging coreference sets. A clustering phase on top of the pairwise classifier no longer is needed and the number of candidate pairs is reduced, since from each coreference set (be it large or small) only one mention (the most representative one) needs to be compared to a new anaphor candidate. We form a ’virtual prototype’ that collects information from all the members of each coreference set in order to maximize ’representativeness’. Constraints such as transitivity and morphological agreement can be assured by just a single comparison. If an anaphor candidate is compatible with the virtual prototype, then it is by definition compatible with all members of the coreference set.
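The sketch below illustrates this incremental entity-mention idea under simplifying assumptions: mentions are assumed to carry gender and number attributes (as in the pair-feature sketch above), the "virtual prototype" merely pools those agreement attributes, and `choose_entity` stands in for whatever trained model selects among compatible coreference sets. It is an illustration of the general scheme, not the authors' implementation.

```python
# Illustrative sketch of an incremental entity-mention model with a
# "virtual prototype" per coreference set. The attribute pooling and the
# choose_entity interface are assumptions made for illustration.

class VirtualPrototype:
    """Collects agreement information from all members of one coreference set."""
    def __init__(self, mention):
        self.members = [mention]
        self.genders = {mention.gender}
        self.numbers = {mention.number}

    def compatible(self, mention):
        # One comparison against the prototype stands in for comparisons
        # against every member of the set.
        return (mention.gender in self.genders or mention.gender == "unknown") and \
               (mention.number in self.numbers or mention.number == "unknown")

    def add(self, mention):
        self.members.append(mention)
        self.genders.add(mention.gender)
        self.numbers.add(mention.number)

def resolve_incremental(mentions, choose_entity):
    """choose_entity(mention, candidate_prototypes) -> a prototype or None."""
    entities = []
    for mention in mentions:
        candidates = [e for e in entities if e.compatible(mention)]
        chosen = choose_entity(mention, candidates)
        if chosen is None:
            entities.append(VirtualPrototype(mention))   # new discourse entity
        else:
            chosen.add(mention)
    return entities
```

No separate clustering phase is needed: the coreference sets are built directly as the mentions are processed in order.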
2010a
- (Ng, 2010) ⇒ Vincent Ng. (2010). “Supervised Noun Phrase Coreference Research: The First Fifteen Years.” In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010).
- QUOTE: Noun phrase (NP) coreference resolution, the task of determining which NPs in a text or dialogue refer to the same real-world entity, has been at the core of natural language processing (NLP) since the 1960s. NP coreference is related to the task of anaphora resolution, whose goal is to identify an antecedent for an anaphoric NP (...)
(...) we examine three important classes of coreference models that were developed in the past fifteen years, namely, the mention-pair model, the entity-mention model, and ranking models.
2010b
- (Uryupina, 2010) ⇒ Olga Uryupina. (2010). “Corry: A System for Coreference Resolution.” In: Proceedings of the 5th International Workshop on Semantic Evaluation.
2009
- (Wick et al., 2009) ⇒ Michael Wick, Aron Culotta, Khashayar Rohanimanesh, and Andrew McCallum. (2009). “An Entity Based Model for Coreference Resolution.” In: Proceedings of the SIAM International Conference on Data Mining (SDM 2009).
- QUOTE: Over the past several years, increasingly powerful supervised machine learning techniques have been developed to solve this problem. Initial solutions treated it as a set of independent binary classifications, one for each pair of mentions [1, 2]. Next, relational probability models were developed to capture the dependency between each of these classifications [3, 4]; however the parameterization of these methods still consists of features over pairs of mentions. Finally, methods have been developed to enable arbitrary features over entire clusters of mentions [5, 6, 7].
2008a
- (Yang et al., 2008) ⇒ Xiaofeng Yang, Jian Su, Jun Lang, Chew Lim Tan, Ting Liu, and Sheng Li. (2008). “An Entity-Mention Model for Coreference Resolution with Inductive Logic Programming.” In: Proceedings of the ACL Conference (ACL 2008).
2008b
- (Bengtson & Roth, 2008) ⇒ Eric Bengtson, and Dan Roth. (2008). “Understanding the Value of Features for Coreference Resolution.” In: Proceedings of the Conference on Empirical Methods in Natural Language Processing.
2007
- (Ng, 2007) ⇒ Vincent Ng. (2007). “Shallow Semantics for Coreference Resolution.” In: Proceedings of the 20th International Joint Conference on Artificial Intelligence.
2004
- (Yang et al., 2004) ⇒ Xiaofeng Yang, Jian Su, Guodong Zhou, and Chew Lim Tan. (2004). “An NP-cluster based Approach to Coreference Resolution.” In: Proceedings of the 20th International Conference on Computational Linguistics. doi:10.3115/1220355.1220388
2002a
- (Ng & Cardie, 2002) ⇒ Vincent Ng, and Claire Cardie. (2002). “Improving Machine Learning Approaches to Coreference Resolution.” In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. doi:10.3115/1073083.1073102
2002b
- (Ng & Cardie, 2002) ⇒ Vincent Ng, and Claire Cardie. (2002). “Combining Sample Selection and Error-driven Pruning for Machine Learning of Coreference Rules.” In: Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10. doi:10.3115/1118693.1118701
2001
- (Soon et al., 2001) ⇒ Wee Meng Soon, Hwee Tou Ng, and Daniel Chung Yong Lim. (2001). “A Machine Learning Approach to Coreference Resolution of Noun Phrases.” In: Computational Linguistics, 27(4). doi:10.1162/089120101753342653
1995
- (McCarthy & Lehnert, 1995) ⇒ Joseph F. McCarthy, and Wendy G. Lehnert. (1995). “Using Decision Trees for Coreference Resolution.” In: Proceedings of the 14th International Joint Conference on Artificial Intelligence - Volume 2. ISBN:1-55860-363-8