CoNLL-2000 Shared Task
The CoNLL-2000 Shared Task is a CoNLL shared task that evaluated the performance of supervised syntactic-phrase text chunking systems.
- Context:
- It introduced the CoNLL-2000 Text String Labeled Segmentation Format (which is an input to the CoNLL-2000 evaluation script).
- …
- Counter-Example(s):
- See: YamCha System.
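The labeled segmentation format mentioned above can be illustrated with a small reader. This is a minimal sketch, assuming the commonly distributed layout of the CoNLL-2000 data: one token per line with three whitespace-separated columns (word, part-of-speech tag, chunk tag in B-/I-/O notation) and a blank line between sentences. The names `parse_conll2000`, `extract_chunks`, and `SAMPLE` are hypothetical helpers introduced here for illustration, and the sample text is illustrative rather than copied from the released files.

```python
from typing import Iterable, List, Tuple

Token = Tuple[str, str, str]  # (word, POS tag, chunk tag)
Chunk = Tuple[str, int, int]  # (chunk type, start index, end index exclusive)

# Illustrative sample in the assumed three-column layout.
SAMPLE = """\
He PRP B-NP
reckons VBZ B-VP
the DT B-NP
current JJ I-NP
account NN I-NP
deficit NN I-NP
will MD B-VP
narrow VB I-VP
. . O
"""


def parse_conll2000(lines: Iterable[str]) -> List[List[Token]]:
    """Group lines into sentences of (word, pos, chunk_tag) triples."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in lines:
        line = line.strip()
        if not line:
            # A blank line closes the current sentence (assumed convention).
            if current:
                sentences.append(current)
                current = []
            continue
        word, pos, tag = line.split()
        current.append((word, pos, tag))
    if current:
        sentences.append(current)
    return sentences


def extract_chunks(tags: List[str]) -> List[Chunk]:
    """Turn a B-/I-/O tag sequence into (type, start, end) spans."""
    chunks: List[Chunk] = []
    start = None
    ctype = None
    # A trailing "O" sentinel closes any chunk still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        prefix, _, label = tag.partition("-")
        # Close the open chunk unless this tag continues it.
        if start is not None and (prefix != "I" or label != ctype):
            chunks.append((ctype, start, i))
            start, ctype = None, None
        # Open a chunk on B-, or leniently on an I- with nothing open.
        if prefix == "B" or (prefix == "I" and start is None):
            start, ctype = i, label
    return chunks


sentences = parse_conll2000(SAMPLE.splitlines())
chunks = extract_chunks([tag for _, _, tag in sentences[0]])
```

Representing each chunk as a (type, start, end) span makes it straightforward to compare predicted and gold segmentations as sets.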
References
2000
- http://www.cnts.ua.ac.be/conll2000/chunking/
- QUOTE:Syntactic-Phrase Text chunking consists of dividing a text in syntactically correlated parts of words. … It was the shared task for CoNLL-2000. Training and test data for this task is available. This data consists of the same partitions of the Wall Street Journal corpus (WSJ) as the widely used data for noun phrase chunking: sections 15-18 as training data (211727 tokens) and section 20 as test data (47377 tokens). The annotation of the data has been derived from the WSJ corpus by a program written by Sabine Buchholz from Tilburg University, The Netherlands.
The goal of this task is to come forward with machine learning methods which after a training phase can recognize the chunk segmentation of the test data as well as possible. The training data can be used for training the text chunker. The chunkers will be evaluated with the F rate, which is a combination of the precision and recall rates: F = 2*precision*recall / (recall+precision) [Rij79]. The precision and recall numbers will be computed over all types of chunks.
- (Tjong Kim Sang & Buchholz, 2000) ⇒ Erik Tjong Kim Sang, and Sabine Buchholz. (2000). “Introduction to the CoNLL-2000 Shared Task: Chunking.” In: Proceedings of CoNLL-2000.
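The F rate quoted above, F = 2*precision*recall / (recall+precision), can be sketched as follows. This is a minimal illustration, assuming that a predicted chunk counts as correct only when both its type and its exact boundaries match a gold chunk. The function `chunk_f_score` and the (type, start, end) chunk representation are hypothetical and are not taken from the official evaluation script.

```python
from typing import Set, Tuple

Chunk = Tuple[str, int, int]  # (chunk type, start index, end index exclusive)


def chunk_f_score(gold: Set[Chunk], predicted: Set[Chunk]) -> Tuple[float, float, float]:
    """Return (precision, recall, F) over all chunk types."""
    # Assumption: a chunk is correct only on an exact type and span match.
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f = 2 * precision * recall / (recall + precision)
    return precision, recall, f


# Hypothetical gold and predicted chunks for a single sentence.
gold = {("NP", 0, 1), ("VP", 1, 2), ("NP", 2, 6), ("VP", 6, 8)}
predicted = {("NP", 0, 1), ("VP", 1, 2), ("NP", 2, 5)}
precision, recall, f = chunk_f_score(gold, predicted)
```

For a multi-sentence test set, each chunk tuple would also need a sentence identifier so that spans from different sentences do not collide in the sets.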