2004 WebScaleIE
- (Etzioni et al., 2004) ⇒ Oren Etzioni, Michael J. Cafarella, Doug Downey, S. Kok, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates. (2004). “Web-scale Information Extraction in KnowItAll: (preliminary results).” In: Proceedings of the 13th International World Wide Web Conference (WWW 2004). doi:10.1145/988672.988687
Subject Headings: Web-based Information Extraction.
Notes
Cited By
Quotes
Abstract
- Manually querying search engines in order to accumulate a large body of factual information is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank potentially relevant documents for human perusal, but do not extract facts, assess confidence, or fuse information from multiple documents. This paper introduces KnowItAll, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner. The paper describes preliminary experiments in which an instance of KnowItAll, running for four days on a single machine, was able to automatically extract 54,753 facts. KnowItAll associates a probability with each fact enabling it to trade off precision and recall. The paper analyzes KnowItAll's architecture and reports on lessons learned for the design of large-scale information extraction systems.
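The precision/recall trade-off mentioned in the abstract can be illustrated with a minimal sketch. This is not KnowItAll's implementation: the function name `precision_recall`, the sample facts, their probabilities, and the correctness labels are all invented for illustration. The sketch only assumes that each extracted fact carries a probability and that a threshold on that probability decides which facts are kept.

```python
from typing import List, Tuple

# Hypothetical (fact, probability, is_correct) triples; all values are
# invented for illustration and do not come from the paper.
Fact = Tuple[str, float, bool]


def precision_recall(facts: List[Fact], threshold: float) -> Tuple[float, float]:
    """Keep facts whose probability is >= threshold and score them
    against the (assumed known) correctness labels."""
    kept = [f for f in facts if f[1] >= threshold]
    total_correct = sum(1 for f in facts if f[2])
    kept_correct = sum(1 for f in kept if f[2])
    # By convention here, an empty kept set counts as precision 1.0.
    precision = kept_correct / len(kept) if kept else 1.0
    recall = kept_correct / total_correct if total_correct else 0.0
    return precision, recall


facts = [
    ("City(Seattle)", 0.95, True),
    ("City(Paris)", 0.90, True),
    ("City(Tokyo)", 0.60, True),
    ("City(Monday)", 0.55, False),
    ("City(Lima)", 0.30, True),
]

for t in (0.8, 0.5, 0.0):
    p, r = precision_recall(facts, t)
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, a high threshold keeps only the most confident facts (higher precision, lower recall), while lowering it admits more facts, including the incorrect one.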
1.1 Previous Work
- …
- KNOWITALL is able to use weaker input than previous IE systems in part because, rather than extracting information from complex and potentially difficult-to-understand texts, KNOWITALL relies on the scale and redundancy of the web for an ample supply of simple sentences that are relatively easy to process. This notion of “redundancy-based extraction” was introduced in Mulder [17] and further articulated in AskMSR [2].
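
The idea of redundancy-based extraction quoted above can be sketched roughly as follows. This is an illustrative toy, not the system's actual extractor: the regular expression `PATTERN`, the helper `count_support`, and the example sentences are assumptions made up for this sketch. It shows only how a simple pattern applied to many simple sentences lets repeated matches accumulate as evidence for a candidate fact.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed, deliberately simple pattern: "cities such as <Capitalized word>".
# A real extractor would need proper noun-phrase handling.
PATTERN = re.compile(r"\bcities such as ([A-Z][a-z]+)")


def count_support(sentences: Iterable[str]) -> Counter:
    """Count how many pattern matches support each candidate city name."""
    counts: Counter = Counter()
    for sentence in sentences:
        for match in PATTERN.finditer(sentence):
            counts[match.group(1)] += 1
    return counts


# Invented example sentences standing in for text retrieved from the web.
sentences = [
    "We flew to cities such as Paris last year.",
    "Large cities such as Paris attract tourists.",
    "He visited cities such as Oslo.",
    "The report ignores cities such as Paris.",
]

support = count_support(sentences)
print(support.most_common())
```

Candidates matched by many independent sentences gather more support than those seen once, which is the intuition behind relying on scale rather than on deep analysis of any single text.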
| | Author | title | journal | titleUrl | doi | year |
|---|---|---|---|---|---|---|
| 2004 WebScaleIE | Doug Downey, Stephen Soderland, Michael J. Cafarella, Daniel S. Weld, Alexander Yates, Oren Etzioni, Ana-Maria Popescu, Tal Shaked, S. Kok | Web-scale Information Extraction in KnowItAll: (preliminary results) | Proceedings of the 13th International World Wide Web Conference | http://turing.cs.washington.edu/papers/www-paper.pdf | 10.1145/988672.988687 | 2004 |