CoNLL-2000 Shared Task

Context:
- It introduced the CoNLL-2000 Text String Labeled Segmentation Format, which is the input format of the CoNLL-2000 evaluation script.
- …
Counter-Example(s):
- CoNLL-2003 Shared Task (on named entity recognition rather than chunking).
See: YamCha System.

References

http://www.cnts.ua.ac.be/conll2000/chunking/
- QUOTE:Syntactic-Phase Text chunking consists of dividing a text in syntactically correlated parts of words. … It was the shared task for CoNLL-2000. Training and test data for this task is available. This data consists of the same partitions of the Wall Street Journal corpus (WSJ) as the widely used data for noun phrase chunking: sections 15-18 as training data (211727 tokens) and section 20 as test data (47377 tokens). The annotation of the data has been derived from the WSJ corpus by a program written by Sabine Buchholz from Tilburg University, The Netherlands.
  The goal of this task is to come forward with machine learning methods which after a training phase can recognize the chunk segmentation of the test data as well as possible. The training data can be used for training the text chunker. The chunkers will be evaluated with the F rate, which is a combination of the precision and recall rates: F = 2*precision*recall / (recall+precision) [Rij79]. The precision and recall numbers will be computed over all types of chunks.
(Tjong Kim Sang & Buchholz, 2000) ⇒ Erik Tjong Kim Sang, and Sabine Buchholz. (2000). “Introduction to the CoNLL-2000 Shared Task: Chunking.” In: Proceedings of CoNLL-2000.
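
The F rate quoted above is computed over whole chunks, not over individual tokens. The following is a minimal sketch of that computation, not the official evaluation script. It assumes chunk tags of the form B-X / I-X / O on one flat tag sequence, and the function names extract_chunks and chunk_f_score are hypothetical. An I-X tag that does not continue a chunk of type X is treated as starting a new chunk, which is a simplifying assumption.

```python
def extract_chunks(tags):
    """Return the set of (start, end, type) chunk spans in a tag sequence.

    Assumes tags such as "B-NP", "I-NP" and "O"; `end` is exclusive.
    """
    chunks = set()
    start, ctype = None, None
    # A trailing "O" sentinel closes any chunk still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        prefix, _, label = tag.partition("-")
        # An open chunk ends unless this tag continues it (I- with same type).
        if start is not None and (prefix != "I" or label != ctype):
            chunks.add((start, i, ctype))
            start, ctype = None, None
        # B- always opens a chunk; a stray I- opens one too (assumption).
        if prefix == "B" or (prefix == "I" and start is None):
            start, ctype = i, label
    return chunks


def chunk_f_score(gold_tags, pred_tags):
    """Return (precision, recall, F) over all chunk types."""
    gold = extract_chunks(gold_tags)
    pred = extract_chunks(pred_tags)
    # A predicted chunk counts as correct only if span and type both match.
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f = 2 * precision * recall / (recall + precision)
    return precision, recall, f


# Illustrative tags for "He reckons the current account deficit".
gold = ["B-NP", "B-VP", "B-NP", "I-NP", "I-NP", "I-NP"]
# Hypothetical system output that wrongly splits the last noun phrase.
pred = ["B-NP", "B-VP", "B-NP", "I-NP", "B-NP", "I-NP"]

precision, recall, f = chunk_f_score(gold, pred)
print(f"precision={precision:.4f} recall={recall:.4f} F={f:.4f}")
```

In this example the system proposes four chunks, two of which match the three gold chunks exactly, so splitting one noun phrase costs both precision and recall even though most token-level tags agree.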