GENIA Corpus
The GENIA Corpus is an Annotated Dataset of Biomedicine Abstracts that have been Curated for Entities in the GENIA Ontology.
- AKA: GENIA Dataset.
- Context:
  - Composed of 2000 Medline abstracts
  - Approximately 500,000 words
  - http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA
- See: GENIA Project, BioCreAtIvE Corpus.
References
2003
- (Kim et al., 2003) ⇒ Jin-Dong Kim, Tomoko Ohta, Yuka Tateisi, and Jun'ichi Tsujii. (2003). “GENIA Corpus - a semantically annotated corpus for bio-textmining.” In: Bioinformatics. 19(suppl. 1).
2002
- (Ohta et al., 2002) ⇒ Tomoko Ohta, Yuka Tateisi, and Jin-Dong Kim. (2002). “The GENIA corpus: an annotated research abstract corpus in molecular biology domain.” In: Proceedings of the 2nd International Conference on Human Language Technology Research (HLT 2002).