GENIA Corpus

The GENIA Corpus is an Annotated Abstracts Dataset of Biomedicine Abstracts that have been Curated for Entities in the GENIA Ontology.

AKA: GENIA Dataset.
Context:
- It is composed of 2,000 MEDLINE abstracts.
- It contains approximately 500,000 words.
- Project page: http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA
See: GENIA Project, BioCreAtIvE Corpus.

References

2003

(Kim et al., 2003) ⇒ Jin-Dong Kim, Tomoko Ohta, Yuka Tateisi, and Jun'ichi Tsujii. (2003). “GENIA Corpus - a semantically annotated corpus for bio-textmining.” In: Bioinformatics. 19(suppl. 1).

2002

(Ohta et al., 2002) ⇒ Tomoko Ohta, Yuka Tateisi, and Jin-Dong Kim. (2002). “The GENIA corpus: an annotated research abstract corpus in molecular biology domain.” In: Proceedings of the 2nd International Conference on Human Language Technology Research (HLT 2002).
