CoNLL-2000 Text String Labeled Segmentation Format
A CoNLL-2000 Text String Labeled Segmentation Format is a BIO-style Text String Labeled Segmentation Format introduced in the CoNLL-2000 shared task.
- Context:
- It can be instantiated in a CoNLL-2000 Format File.
- See: Space-Separated Data File Format.
References
2008
- http://cogcomp.cs.illinois.edu/page/software_view/13
- … performance can be tested on test data labeled in the same format as the CoNLL 2000 corpus …
2003
- http://www.clips.ua.ac.be/conll2003/ner/
- Output example of the evaluation program for this shared task: conlleval. The example deals with text chunking, a task which uses the same output format as this named entity task. The program requires the output of the NER system for each word to be appended to the corresponding line in the test file, with a single space between the line and the output tag. Make sure you keep the empty lines in the test file otherwise the software may mistakenly regard separate entities as one big entity.
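The appending step described in this excerpt can be sketched in Python. This is a minimal illustration, not part of conlleval: the function name `append_predictions` and the sample lines are assumptions made for the example. It adds one predicted tag to each non-empty line, separated by a single space, and leaves empty lines in place so that sentence boundaries survive.

```python
def append_predictions(test_lines, predicted_tags):
    """Append one predicted tag to each non-empty line; keep empty lines.

    Hypothetical helper for illustration; it is not part of conlleval.
    """
    tags = iter(predicted_tags)
    output = []
    for line in test_lines:
        stripped = line.rstrip("\n")
        if not stripped.strip():
            # An empty line marks a sentence boundary and must be preserved.
            output.append("")
        else:
            # A single space separates the original line from the new tag.
            output.append(stripped + " " + next(tags))
    return output


# Illustrative input: word, POS tag, and corpus chunk tag on each line.
test_lines = ["Boeing NNP B-NP", "'s POS B-NP", "", "Rockwell NNP B-NP"]
result = append_predictions(test_lines, ["I-NP", "B-NP", "I-NP"])
print("\n".join(result))
```

The number of predicted tags is assumed to equal the number of non-empty lines; the sketch does not check for leftover tags.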
2000
- http://www.cnts.ua.ac.be/conll2000/chunking/output.html
- This is an output example for the Perl script conlleval, which can be used for measuring the performance of a system that has processed the CoNLL-2000 shared task data. The input of this script should consist of lines similar to the shared task data files.
Each line contains four symbols: the current word, its part-of-speech tag (POS), the chunk tag according to the corpus and the predicted chunk tag. Sentences have been separated by empty lines.
Here is an example:
Boeing NNP B-NP I-NP
's POS B-NP B-NP
747 CD I-NP I-NP
jetliners NNS I-NP I-NP
. . O O

Rockwell NNP B-NP I-NP
said VBD B-VP B-VP
the DT B-NP B-NP
agreement NN I-NP I-NP
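The four-column layout described above (word, POS tag, corpus chunk tag, predicted chunk tag, with empty lines between sentences) can be read with a short Python sketch. The names `read_sentences` and `extract_chunks` are hypothetical, and the chunk-boundary rule is an assumption of this sketch: an `I-` tag with no open chunk of the same type is treated as starting a new chunk. It is not a reimplementation of conlleval.

```python
SAMPLE = """\
Boeing NNP B-NP I-NP
's POS B-NP B-NP
747 CD I-NP I-NP
jetliners NNS I-NP I-NP
. . O O

Rockwell NNP B-NP I-NP
said VBD B-VP B-VP
the DT B-NP B-NP
agreement NN I-NP I-NP
"""


def read_sentences(text):
    """Split text into sentences; each token is a tuple of its columns."""
    sentences, current = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            # An empty line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        current.append(tuple(fields))
    if current:
        sentences.append(current)
    return sentences


def extract_chunks(tags):
    """Return (type, start, end) spans, end exclusive, from BIO tags."""
    chunks = []
    start, chunk_type = None, None
    # A trailing "O" sentinel closes any chunk still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        prefix, _, tag_type = tag.partition("-")
        # Close the open chunk unless this tag continues it.
        if chunk_type is not None and (prefix != "I" or tag_type != chunk_type):
            chunks.append((chunk_type, start, i))
            start, chunk_type = None, None
        # Assumption: an I- tag with no open chunk also starts a chunk.
        if prefix == "B" or (prefix == "I" and chunk_type is None):
            start, chunk_type = i, tag_type
    return chunks


sentences = read_sentences(SAMPLE)
for sentence in sentences:
    corpus_tags = [token[2] for token in sentence]
    predicted_tags = [token[3] for token in sentence]
    print(extract_chunks(corpus_tags), extract_chunks(predicted_tags))
```

Comparing the two span lists per sentence is the basis for chunk-level precision and recall; the scoring itself is left out of this sketch.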