2009 TheARTCorpus

(Liakata & Soldatova, 2009) ⇒ Maria Liakata, Larisa N. Soldatova. (2009). “The ART Corpus.” Technical report, Aberystwyth University.

Subject Headings: ART Corpus.

Notes

Cited By

~3 http://scholar.google.com/scholar?cites=8717018573970677041

Quotes

Abstract

The ART corpus consists of 225 papers manually annotated with CISP labels (e.g., "Goal", "Method", "Result"). It contains more than 1 million words in 35,040 sentences. The papers cover topics in physical chemistry and biochemistry and were provided by the Royal Society of Chemistry (RSC) Publishing. The corpus was developed primarily to add value to scientific papers through semantic markup that makes it easier for natural language processing and semantic web applications to automatically extract information pertaining to core scientific concepts. The ART corpus can also be used as a training set for machine learning algorithms, in order to automate the annotation of papers with CISP meta-data. The corpus is available as a collection of 225 .xml files, where each file corresponds to a separate paper whose sentences have been annotated individually with core scientific concepts.
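Since each paper is distributed as an .xml file with per-sentence annotations, a label tally is a natural first step before using the corpus as training data. The following is a minimal sketch only: the abstract does not describe the schema, so the element name `s`, the element name `annotationART`, and the attribute `type` are assumptions made for illustration, and the sample sentences are invented. Adjust the three constants to match the actual files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ASSUMPTION: these names are hypothetical placeholders, not taken from the report.
SENTENCE_TAG = "s"                # element wrapping one sentence
ANNOTATION_TAG = "annotationART"  # child element carrying the annotation
LABEL_ATTR = "type"               # attribute holding the concept label


def count_labels(xml_text: str) -> Counter:
    """Count how many sentences carry each concept label in one paper."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for sentence in root.iter(SENTENCE_TAG):
        annotation = sentence.find(ANNOTATION_TAG)
        if annotation is None:
            continue  # sentence without an annotation child
        label = annotation.get(LABEL_ATTR)
        if label is not None:
            counts[label] += 1
    return counts


# Invented sample in the assumed layout; not real corpus content.
SAMPLE = """<paper>
  <s sid="1"><annotationART type="Goal">We aim to measure the rate constant.</annotationART></s>
  <s sid="2"><annotationART type="Method">Spectra were recorded at 298 K.</annotationART></s>
  <s sid="3"><annotationART type="Method">Samples were degassed before use.</annotationART></s>
  <s sid="4"><annotationART type="Result">The rate constant doubled.</annotationART></s>
</paper>"""

if __name__ == "__main__":
    for label, n in sorted(count_labels(SAMPLE).items()):
        print(label, n)
```

To cover the whole corpus under the same assumptions, one could read each of the 225 files as text and sum the returned counters.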

Author: Maria Liakata, Larisa Soldatova
title: The ART Corpus
titleUrl: http://cadair.aber.ac.uk/dspace/handle/2160/1979
year: 2009
