kdd09cma1 Corpus
The kdd09cma1 corpus is a human-annotated, domain-specific semantic corpus based on the research paper abstracts of KDD-2009.
- Context:
- It is the first version of the kdd09cma Corpus.
- It references the kddo1 Ontology.
- It can be downloaded from http://www.gabormelli.com/Projects/kdd/data/sigkdd/kdd09cma1/
- It is in the public domain.
- …
- Example(s):
- ...
- …
- Counter-Example(s):
- See: ACM SIGKDD, Annotated Text, Text Wikification System, Ontology, Semantic Wiki, GM-RKB WikiFixer System, Natural Language Processing, Knowledge Discovery.
References
2010
- (Melli, 2010a) ⇒ Gabor Melli. (2010). “Concept Mentions within KDD-2009 Abstracts (kdd09cma1) Linked to a KDD Ontology (kddo1).” In: Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC 2010).
- QUOTE: The kdd09cma1 corpus is based on the 139 abstracts of the papers accepted for ACM's SIGKDD annual conference in 2009 (KDD 2009) that are freely accessible from ACM's Digital Library [1]. KDD is a competitive peer-reviewed conference with acceptance rates in the range of 20%-25%. The conference topic is data mining and knowledge discovery from databases.
The abstracts were manually annotated by the author for concept mentions. We define a concept mention to be a sequence of tokens (orthographic words and punctuation) whose meaning is deemed by an expert to be used within their community of speakers, and whose meaning is not necessarily well understood by a member of the general public. Often concept mentions are words (terminological units), but not always. The mentions can also be phrases. For example the phrase “problem of web classification” could be identified as a mention of the Web_Object Classification_Task concept.
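The definition of a concept mention quoted above (a token sequence linked to an ontology concept) can be illustrated with a minimal Python sketch. The `ConceptMention` class, the `find_phrase` helper, and the example sentence are hypothetical and introduced only for illustration; this is not the corpus's actual file format or annotation tooling.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ConceptMention:
    """A hypothetical record: a token span linked to an ontology concept."""
    start: int               # index of the first token (inclusive)
    end: int                 # index just past the last token (exclusive)
    concept: Optional[str]   # ontology concept name, or None if unlinked

    def text(self, tokens: List[str]) -> str:
        # Recover the surface form of the mention from the token list.
        return " ".join(tokens[self.start:self.end])


def find_phrase(tokens: List[str], phrase: List[str]) -> Optional[Tuple[int, int]]:
    """Return the (start, end) span of the first occurrence of phrase, or None."""
    n = len(phrase)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase:
            return i, i + n
    return None


# Invented example sentence; only the phrase and concept name come from the quote.
tokens = "We study the problem of web classification .".split()
span = find_phrase(tokens, "problem of web classification".split())
mention = ConceptMention(span[0], span[1], "Web_Object Classification_Task")
print(mention.text(tokens), "->", mention.concept)
```

Storing token offsets rather than the surface string keeps multi-word mentions and single-word mentions in the same form, which matches the quote's point that mentions can be either words or phrases.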