CORA Benchmark Task
A CORA Benchmark Task is a Benchmark Task that is based on the Cora collection of computer science research papers and their citations, and covers tasks such as citation matching, research paper classification, and information extraction.
- AKA: CORA.
- …
- Example(s):
- CORA Citation Benchmark Task.
- Cora Citation Matching Task (reference matching, object correspondence): the text of citations, hand-clustered into groups that refer to the same paper.
- Cora Research Paper Classification Task (relational document classification): research papers classified into a topic hierarchy with 73 leaves; the citations provide relations among papers, making it a relational data set.
- Cora Information Extraction Task (information extraction): research paper headers and citations, with labeled segments for authors, title, institutions, venue, date, page numbers, and several other fields (see the illustrative record sketch below).
- See: CORA Citation Search Engine, ACE Benchmark Task.
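The three Cora sub-tasks listed above are defined over differently structured data. The following Python sketch illustrates, purely as an assumption for exposition, what one record from each task might look like; the class and field names are hypothetical and do not correspond to the actual Cora distribution's file formats.

```python
# Hypothetical record sketches for the three Cora benchmark tasks.
# All names and structures below are illustrative assumptions, not the
# actual Cora file formats.

from dataclasses import dataclass
from typing import List


@dataclass
class CitationMatchingRecord:
    """One citation string plus the hand-assigned cluster of the paper it
    refers to (citation matching / object correspondence)."""
    citation_text: str
    cluster_id: int  # citations sharing a cluster_id refer to the same paper


@dataclass
class PaperClassificationRecord:
    """One research paper with its topic label (a leaf of the 73-leaf
    hierarchy) and its outgoing citations, which give the relational
    structure among papers."""
    paper_id: str
    topic_path: List[str]        # path from the hierarchy root to a leaf topic
    cited_paper_ids: List[str]   # papers this paper cites


@dataclass
class ExtractionRecord:
    """A paper header or citation with one segment label per token
    (author, title, institution, venue, date, pages, ...)."""
    tokens: List[str]
    labels: List[str]


# Synthetic usage example (not data from the actual corpus):
rec = CitationMatchingRecord(
    citation_text="McCallum et al. Automating the Construction of Internet Portals ...",
    cluster_id=42,
)
print(rec.cluster_id)
```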
References
2004
- http://www.cs.umass.edu/~mccallum/code-data.html
- Cora Citation Matching [reference matching, object correspondence]
- Text of citations hand-clustered into groups referring to the same paper.
- Cora Research Paper Classification [relational document classification]
- Research papers classified into a topic hierarchy with 73 leaves. We call this a relational data set, because the citations provide relations among papers.
- Cora Information Extraction [information extraction]
- Research paper headers and citations, with labeled segments for authors, title, institutions, venue, date, page numbers and several other fields.
- http://www.cs.umass.edu/~mccallum/research.html
- Extraction, Integration and Mining of Bibliographic Data
- Back in the 1990's I was the leader of the project at JustResearch that created Cora, a domain-specific search engine over computer science research papers. It currently contains over 50,000 postscript papers. You can read more about our research on Cora in our IRJ journal paper or a paper presented at the AAAI'99 Spring Symposium. The Cora team also included Kamal Nigam, Kristie Seymore, Jason Rennie, Huan Chang and Jason Reed.
- More recently we have been working on an enhanced alternative to Google Scholar, CiteSeer, and other digital libraries of the research literature. Our system, called Rexa, automatically extracts a de-duplicated cross-referenced database of not just papers (and references), but also people and grants, and so also publication venues and institutions. We also perform various kinds of topic and bibliometric impact analysis on this data.
2000
- (McCallum et al., 2000c) ⇒ Andrew McCallum, Kamal Nigam, Jason Rennie, and Kristie Seymore. (2000). “Automating the Construction of Internet Portals with Machine Learning.” In: Information Retrieval, 3(2). (doi:10.1023/A:1009953814988).