GENIA Corpus

The GENIA Corpus is an Annotated Dataset of Biomedical Abstracts that have been Curated for Entities in the GENIA Ontology.

AKA: GENIA Dataset.
Context:
- It is composed of 2,000 MEDLINE abstracts.
- It contains approximately 500,000 words.
- http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA
See: GENIA Project, BioCreAtIvE Corpus.

References

2003

(Kim et al., 2003) ⇒ Jin-Dong Kim, Tomoko Ohta, Yuka Tateisi, and Jun'ichi Tsujii. (2003). “GENIA Corpus - a semantically annotated corpus for bio-textmining.” In: Bioinformatics. 19(suppl. 1).

2002

(Ohta et al., 2002) ⇒ Tomoko Ohta, Yuka Tateisi, and Jin-Dong Kim. (2002). “The GENIA corpus: an annotated research abstract corpus in molecular biology domain.” In: Proceedings of the 2nd International Conference on Human Language Technology Research (HLT 2002).
