2003 GENIAcorpus
- (Kim et al., 2003) ⇒ Jin-Dong Kim, Tomoko Ohta, Yuka Tateisi, Jun'ichi Tsujii. (2003). “GENIA Corpus - a semantically annotated corpus for bio-textmining.” In: Bioinformatics. 19(suppl. 1).
Subject Headings: GENIA Corpus, Computational Molecular Biology.
Notes
Cited By
Quotes
Abstract
- Motivation
Natural language processing (NLP) methods are regarded as being useful to raise the potential of text mining from biological literature. The lack of an extensively annotated corpus of this literature, however, causes a major bottleneck for applying NLP techniques. GENIA corpus is being developed to provide reference materials to let NLP techniques work for bio-textmining.
- Results
GENIA corpus version 3.0 consisting of 2000 MEDLINE abstracts has been released with more than 400 000 words and almost 100 000 annotations for biological terms.
- Availability
GENIA corpus is freely available at http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA.
References
| Author | title | journal | titleUrl | year |
---|---|---|---|---|---|
2003 GENIAcorpus | Jun'ichi Tsujii, Tomoko Ohta, Jin-Dong Kim, Yuka Tateisi | GENIA Corpus - a semantically annotated corpus for bio-textmining | Bioinformatics Subject Area | http://bioinformatics.oxfordjournals.org/cgi/content/abstract/19/suppl_1/i180 | 2003 |