2008 WebscaleNER
- (Whitelaw et al., 2008) ⇒ Casey Whitelaw, Alex Kehlenbeck, Nemanja Petrovic, Lyle Ungar. (2008). “Web-scale named entity recognition.” In: Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM 2008). doi:10.1145/1458082.1458102
Subject Headings: Web-Scale Algorithm, Named Entity Recognition Algorithm.
Notes
Cited By
Quotes
Abstract
Automatic recognition of named entities such as people, places, organizations, books, and movies across the entire web presents a number of challenges, both of scale and scope. Data for training general named entity recognizers is difficult to come by, and efficient machine learning methods are required once we have found hundreds of millions of labeled observations. We present an implemented system that addresses these issues, including a method for automatically generating training data, and a multi-class online classification training method that learns to recognize not only high level categories such as place and person, but also more fine-grained categories such as soccer players, birds, and universities. The resulting system gives precision and recall performance comparable to that obtained for more limited entity types in much more structured domains such as company recognition in newswire, even though web documents often lack consistent capitalization and grammatical sentence construction.