Open Information Extraction System
An Open Information Extraction (OIE) System is an IE System that can solve an Open IE Task.
- Context:
- It can (typically) be an Open IE from Text System.
- Example(s):
- Counter-Example(s):
- See: Open Question-Answering System, Natural Language Processing, Named-Entity Recognizer, Machine Learning System.
References
2007
- (Banko et al., 2007) ⇒ Michele Banko, Michael J. Cafarella, Stephen Soderland, Matt Broadhead and Oren Etzioni (2007, January). "Open information extraction from the web". In IJCAI (Vol. 7, pp. 2670-2676).
- QUOTE: This paper introduces Open Information Extraction (OIE)— a novel extraction paradigm that facilitates domain independent discovery of relations extracted from text and readily scales to the diversity and size of the Web corpus. The sole input to an OIE system is a corpus, and its output is a set of extracted relations. An OIE system makes a single pass over its corpus guaranteeing scalability with the size of the corpus (...)
The paper reports on TEXTRUNNER, the first scalable, domain-independent OIE system. TEXTRUNNER is a fully implemented system that extracts relational tuples from text. The tuples are assigned a probability and indexed to support efficient extraction and exploration via user queries.
The main contributions of this paper are to:
- Introduce Open Information Extraction (OIE) — a new extraction paradigm that obviates relation specificity by automatically discovering possible relations of interest while making only a single pass over its corpus.
- Introduce TEXTRUNNER, a fully implemented OIE system, and highlight the key elements of its novel architecture. The paper compares TEXTRUNNER experimentally with the state-of-the-art Web IE system, KNOWITALL, and shows that TEXTRUNNER achieves a 33% relative error reduction for a comparable number of extractions.
- Report on statistics over TEXTRUNNER’s 11,000,000 highest probability extractions, which demonstrates its scalability, helps to assess the quality of its extractions, and suggests directions for future work.
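The input/output contract described in the quote above (a corpus in, scored and indexed relational tuples out, produced in a single pass) can be illustrated with a toy sketch. This is not TEXTRUNNER's architecture: the regular expression `TRIPLE`, the functions `extract_tuples`, `score`, and `build_index`, and the `n / (n + 1)` scoring rule are all hypothetical stand-ins chosen only to keep the example short.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy pattern (an assumption, not from the paper):
# a capitalized argument, one or more lowercase relation words,
# then another capitalized argument.
TRIPLE = re.compile(r"([A-Z]\w+) ((?:[a-z]+ )+)([A-Z]\w+)")


def extract_tuples(corpus):
    """Make one pass over the corpus and count (arg1, relation, arg2) tuples."""
    counts = Counter()
    for sentence in corpus:  # each sentence is visited exactly once
        for m in TRIPLE.finditer(sentence):
            counts[(m.group(1), m.group(2).strip(), m.group(3))] += 1
    return counts


def score(counts):
    """Assign each tuple a naive confidence that grows with its frequency.

    The n / (n + 1) rule is an illustrative placeholder, not a real
    probability model.
    """
    return {t: n / (n + 1) for t, n in counts.items()}


def build_index(scored):
    """Index tuples by relation string so they can be looked up by query."""
    index = defaultdict(list)
    for t in scored:
        index[t[1]].append(t)
    return index


if __name__ == "__main__":
    corpus = [
        "Einstein was born in Ulm.",
        "Paris is the capital of France.",
        "Einstein was born in Ulm.",
    ]
    scored = score(extract_tuples(corpus))
    index = build_index(scored)
    for t in index["was born in"]:
        print(t, round(scored[t], 3))
```

No relation names are supplied in advance; the relation strings are whatever the pattern finds in the text, which is the sense in which the extraction is "open". A real system would replace the regular expression with a learned extractor and the scoring rule with a proper probability estimate.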