ACE-2002
See: ACE Program, 2002.
References
2002
- http://www.nist.gov/speech/tests/ace/phase2/
- ACE-2 Version 1.0
- Year: 2003 (Sept 2002?)
- http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2003T11
- (Zhou et al., 2005)
- "According to the scope of the NIST Automatic Content Extraction (ACE) program, current research in IE has three main objectives: Entity Detection and Tracking (EDT), Relation Detection and Characterization (RDC), and Event Detection and Characterization (EDC). The EDT task entails the detection of entity mentions and chaining them together by identifying their coreference. In ACE vocabulary, entities are objects, mentions are references to them, and relations are semantic relationships between entities. Entities can be of five types: persons, organizations, locations, facilities and geo-political entities (GPE: geographically defined regions that indicate a political boundary, e.g. countries, states, cities, etc.). Mentions have three levels: names, nominal expressions or pronouns. The RDC task detects and classifies implicit and explicit relations between entities identified by the EDT task. For example, we want to determine whether a person is at a location, based on the evidence in the context. Extraction of semantic relationships between entities can be very useful for applications such as question answering, e.g. to answer the query “Who is the president of the United States?”.
- "In ACE (http://www.ldc.upenn.edu/Projects/ACE), explicit relations occur in text with explicit evidence suggesting the relationships. Implicit relations need not have explicit supporting evidence in text, though they should be evident from a reading of the document."
- ACE-2002 Overview
- A training dataset with 422 annotated texts and a testing dataset with 90 texts.
- 348 documents, 125K words and 4,400 relations.
- Contains text from two sources: newswire (nwire) and broadcast news transcripts (bnews).
- The annotation consists of Coreference Tagging, Named Entity Tagging, and Semantic Relation Tagging.
- Five (5) Entity Types are annotated: PERSON, ORGANIZATION, GEO-POLITICAL ENTITY, LOCATION, FACILITY
- Five (5) Semantic Relations are annotated: ROLE, PART, LOCATED, NEAR, and SOCIAL.
- "In total, there are 7,646 intra-sentential relations, of which 6,156 are in the training data and 1,490 in the test data.” (Bunescu and Mooney, 2007)
- "We use the official ACE corpus from LDC. The training set consists of 674 annotated text documents (~300k words) and 9683 instances of relations.” (Zhou et al., 2005)
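The annotation scheme described above (five entity types, three mention levels, five relation types, explicit versus implicit relations) can be pictured as a small data model. The following is only a minimal sketch: the class and field names are hypothetical and chosen for illustration, the example sentence fragments are made up, and nothing here reflects the actual file format of the LDC release.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class EntityType(Enum):
    # The five entity types listed in the overview.
    PERSON = "PERSON"
    ORGANIZATION = "ORGANIZATION"
    GPE = "GEO-POLITICAL ENTITY"
    LOCATION = "LOCATION"
    FACILITY = "FACILITY"


class MentionLevel(Enum):
    # The three mention levels named in the quoted description.
    NAME = "name"
    NOMINAL = "nominal"
    PRONOUN = "pronoun"


class RelationType(Enum):
    # The five semantic relations listed in the overview.
    ROLE = "ROLE"
    PART = "PART"
    LOCATED = "LOCATED"
    NEAR = "NEAR"
    SOCIAL = "SOCIAL"


@dataclass
class Mention:
    text: str
    level: MentionLevel


@dataclass
class Entity:
    # An entity is an object; its mentions are the coreferent references to it.
    entity_type: EntityType
    mentions: List[Mention] = field(default_factory=list)


@dataclass
class Relation:
    relation_type: RelationType
    arg1: Entity
    arg2: Entity
    explicit: bool = True  # False for relations without explicit textual evidence


# Made-up example: a person who is at a location.
person = Entity(EntityType.PERSON, [
    Mention("the reporter", MentionLevel.NOMINAL),
    Mention("she", MentionLevel.PRONOUN),
])
place = Entity(EntityType.LOCATION, [Mention("the river bank", MentionLevel.NOMINAL)])
rel = Relation(RelationType.LOCATED, person, place)
print(rel.relation_type.value, len(person.mentions))
```

In this picture, EDT produces the `Entity` and `Mention` objects, and RDC produces `Relation` objects over pairs of entities.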
- ACE-2002 Reported Results
(Scenario) Method | Precision | Recall | F-measure |
(S1) K4 | 70.3 | 26.3 | 38.0 |
(S1) SSK | 73.9 | 35.2 | 47.7 |
(S1) SPK-CCG | 67.5 | 37.2 | 48.0 |
(S1) SPK-CFG | 71.1 | 39.2 | 50.5 |
(S2) K4 | 67.1 | 35.0 | 45.8 |
(S2) SPK-CCG | 63.7 | 41.4 | 50.2 |
(S2) SPK-CFG | 65.5 | 43.8 | 52.5 |
ZSZZ05 | 63.1 | 49.5 | 55.5 |
- Legend
- K4: (Culotta and Sorensen, 2004)
- SSK, SPK-CCG, SPK-CFG: (Bunescu and Mooney, 2007)
- ZSZZ05: (Zhou et al., 2005)
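The F-measure column in the table above can be sanity-checked against the precision and recall columns. The sketch below assumes the F-measure is the balanced harmonic mean of precision and recall (F1); the source does not state which variant was used, and the function name `f_measure` is just an illustrative choice.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, precision, recall, reported F-measure), copied from the table.
reported = [
    ("(S1) K4", 70.3, 26.3, 38.0),
    ("(S1) SSK", 73.9, 35.2, 47.7),
    ("(S1) SPK-CCG", 67.5, 37.2, 48.0),
    ("(S1) SPK-CFG", 71.1, 39.2, 50.5),
    ("(S2) K4", 67.1, 35.0, 45.8),
    ("(S2) SPK-CCG", 63.7, 41.4, 50.2),
    ("(S2) SPK-CFG", 65.5, 43.8, 52.5),
    ("ZSZZ05", 63.1, 49.5, 55.5),
]

for name, p, r, f in reported:
    recomputed = f_measure(p, r)
    print(f"{name:13s} reported={f:5.1f} recomputed={recomputed:5.1f}")
```

Under this assumption most rows reproduce to one decimal place. The two K4 rows come out a few tenths higher than the reported values; the source does not explain the difference, so it should not be read as an error in either column.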