Text Wikification System
A Text Wikification System is a text annotation system that implements a text wikification algorithm to solve a text wikification task.
- AKA: Wikifier.
- Context:
- It can be supported by:
- It can range from being a General Knowledge-based Wikification System to being a Domain-Specific Wikification System.
- It can range from (typically) being a Same-Language Wikification System to being a Cross-Language Wikification System.
- ...
- Example(s):
- Counter-Example(s):
- See: Document to Ontology Interlinking System, Natural Language Processing System, Wikimedia, Disambiguation to Wikipedia Task, GM-RKB WikiFixer System.
References
2019
- (Wikipedia, 2019) ⇒ https://en.wikipedia.org/wiki/Entity_linking Retrieved:2019-6-15.
- … Entity linking requires a knowledge base containing the entities to which entity mentions can be linked. A popular choice for entity linking on open domain text are knowledge-bases based on Wikipedia,[1] [2] in which each page is regarded as a named entity. NED using Wikipedia entities has been also called wikification (see Wikify! an early entity linking system[3] ). A knowledge base may also be induced automatically from training text [4] or manually built. (...)
- ↑ M. A. Khalid, V. Jijkoun and M. de Rijke (2008). The impact of named entity normalization on information retrieval for question answering. Proc. ECIR.
- ↑ Xianpei Han, Le Sun and Jun Zhao (2011). Collective entity linking in web text: a graph-based method. Proc. SIGIR.
- ↑ Rada Mihalcea and Andras Csomai (2007). Wikify! Linking Documents to Encyclopedic Knowledge. Proc. CIKM.
- ↑ Aaron M. Cohen (2005). Unsupervised gene/protein named entity normalization using automatically extracted dictionaries. Proc. ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics, pp. 17–24.
2014
- (Roth et al., 2014) ⇒ Dan Roth, Heng Ji, Ming-Wei Chang, and Taylor Cassidy. (2014). “Wikification and Beyond: The Challenges of Entity and Concept Grounding.” Tutorial at ACL 2014.
- QUOTE: Contextual disambiguation and grounding of concepts and entities in natural language text are essential to moving forward in many natural language understanding related tasks and are fundamental to many applications. The Wikification task (Bunescu and Pasca, 2006; Mihalcea and Csomai, 2007; Ratinov et al., 2011) aims at automatically identifying concept mentions appearing in a text document and link it to (or “ground it in”) a concept referent in a knowledge base (KB) (e.g., Wikipedia). For example, consider the sentence, "The Times report on Blumental (D) has the potential to fundamentally reshape the contest in the Nutmeg State.", a Wikifier should identify the key entities and concepts (Times, Blumental, D and the Nutmeg State), and disambiguate them by mapping them to an encyclopedic resource revealing, for example, that “D” here represents the Democratic Party, and that “the Nutmeg State” refers to Connecticut.
2013
- (Cheng & Roth, 2013) ⇒ Xiao Cheng, and Dan Roth. (2013). “Relational Inference for Wikification.” In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).
- QUOTE: Wikification (D2W), the task of identifying concepts and entities in text and disambiguating them into their corresponding Wikipedia page(...)
Given a document D containing a set of concept and entity mentions M (referred to later as surface), the goal of Wikification is to find the most accurate mapping from mentions to Wikipedia titles T; this mapping needs to take into account our understanding of the text as well as background knowledge that is often needed to determine the most appropriate title. We also allow a special NIL title that captures all mentions that are outside Wikipedia.
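The mapping from mentions to titles, including the NIL title, can be illustrated with a minimal sketch. It is not the relational-inference method of the cited paper: the candidate table, its scores, the `threshold` parameter, and the function name `map_mentions` are all hypothetical and chosen only to show the shape of the output.

```python
from typing import Dict, List

NIL = "NIL"  # special title for mentions that have no Wikipedia page


def map_mentions(mentions: List[str],
                 candidates: Dict[str, Dict[str, float]],
                 threshold: float = 0.5) -> Dict[str, str]:
    """Map each mention to its best-scoring title, or to NIL."""
    mapping = {}
    for mention in mentions:
        scored = candidates.get(mention, {})
        if not scored:
            # No candidate title at all: the mention is outside Wikipedia.
            mapping[mention] = NIL
            continue
        best_title, best_score = max(scored.items(), key=lambda kv: kv[1])
        # Low-confidence winners are also treated as NIL (an assumption).
        mapping[mention] = best_title if best_score >= threshold else NIL
    return mapping


# Made-up candidate scores, for illustration only.
toy_candidates = {
    "Mercury": {"Mercury_(planet)": 0.6, "Mercury_(element)": 0.4},
    "Apollo": {"Apollo": 0.3, "Apollo_program": 0.3},
}
result = map_mentions(["Mercury", "Apollo", "Zorblax"], toy_candidates)
print(result)
```

A real system would derive the scores from the text and from background knowledge rather than from a fixed table.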
2011
- (Ratinov et al., 2011) ⇒ Lev Ratinov, Dan Roth, Doug Downey, and Mike Anderson. (2011). “Local and Global Algorithms for Disambiguation to Wikipedia.” In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1. ISBN:978-1-932432-87-9.
- QUOTE: Wikification is the task of identifying and linking expressions in text to their referent Wikipedia pages(...) Previous studies on Wikification differ with respect to the corpora they address and the subset of expressions they attempt to link. For example, some studies focus on linking only named entities, whereas others attempt to link all “interesting” expressions, mimicking the link structure found in Wikipedia. Regardless, all Wikification systems are faced with a key Disambiguation to Wikipedia (D2W) task. In the D2W task, we’re given a text along with explicitly identified substrings (called mentions) to disambiguate, and the goal is to output the corresponding Wikipedia page, if any, for each mention. For example, given the input sentence “I am visiting friends in <Chicago>,” we output http://en.wikipedia.org/wiki/Chicago – the Wikipedia page for the city of Chicago, Illinois, and not (for example) the page for the 2002 film of the same name.
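The D2W input and output format described in this quote can be sketched as follows. This is a toy context-overlap heuristic, not the local or global algorithms of the cited paper: the `CANDIDATES` table, its cue words, and the function name `disambiguate` are assumptions made for illustration.

```python
import re
from typing import Dict, List, Optional, Set, Tuple

WIKI_PREFIX = "http://en.wikipedia.org/wiki/"

# Hypothetical candidate pages per mention, each with a few cue words.
CANDIDATES: Dict[str, Dict[str, Set[str]]] = {
    "Chicago": {
        "Chicago": {"city", "illinois", "visiting", "friends"},
        "Chicago_(2002_film)": {"film", "musical", "watch", "movie"},
    }
}


def disambiguate(text: str) -> List[Tuple[str, Optional[str]]]:
    """Resolve each <mention> in the text to a page URL, or None."""
    context = set(re.findall(r"[a-z]+", text.lower()))
    results = []
    for mention in re.findall(r"<([^<>]+)>", text):
        pages = CANDIDATES.get(mention)
        if not pages:
            results.append((mention, None))  # no corresponding page
            continue
        # Pick the page whose cue words overlap most with the context.
        best = max(pages, key=lambda page: len(pages[page] & context))
        results.append((mention, WIKI_PREFIX + best))
    return results


city = disambiguate("I am visiting friends in <Chicago>,")
film = disambiguate("We plan to watch the film <Chicago> tonight.")
print(city)
print(film)
```

Under these toy cue words the first sentence resolves to the city page and the second to the film page; an unknown mention resolves to `None`, matching the "if any" clause of the task definition.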
2007
- (Mihalcea & Csomai, 2007) ⇒ Rada Mihalcea, and Andras Csomai. (2007). “Wikify!: Linking documents to encyclopedic knowledge.” In: Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management (CIKM 2007). doi:10.1145/1321440.1321475
- QUOTE: Given a text or hypertext document, we define “text wikification” as the task of automatically extracting the most important words and phrases in the document, and identifying for each such keyword the appropriate link to a Wikipedia article. This is the typical task performed by the Wikipedia users when contributing articles to the Wikipedia repository(...) Automatic text wikification implies solutions for the two main tasks performed by a Wikipedia contributor when adding links to an article: (1) keyword extraction, and (2) link disambiguation.
The first task consists of identifying those words and phrases that are considered important for the document at hand(...)
The second task consists of finding the correct Wikipedia article that should be linked to a candidate keyword (...)
We designed and implemented a system that solves the “text wikification” problem in four steps, as illustrated in Figure 2.
Figure 2: The architecture of the system for automatic text wikification.
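The two main tasks named in this quote, keyword extraction followed by link disambiguation, can be chained as in the minimal sketch below. It does not reproduce the four-step architecture of the cited system: the `IMPORTANCE` scores, the `SENSES` table, the `min_score` cutoff, and all function names are hypothetical.

```python
import re
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical importance score per word (stands in for keyword ranking).
IMPORTANCE: Dict[str, float] = {"jaguar": 0.8, "rainforest": 0.6, "prey": 0.2}

# Hypothetical candidate articles per keyword, each with a few cue words.
SENSES: Dict[str, Dict[str, Set[str]]] = {
    "jaguar": {
        "Jaguar": {"rainforest", "cat", "prey"},
        "Jaguar_Cars": {"car", "engine", "speed"},
    },
    "rainforest": {"Rainforest": set()},
}


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z]+", text.lower())


def extract_keywords(text: str, min_score: float = 0.5) -> List[str]:
    """Task 1: keep words considered important, in order of appearance."""
    keywords: List[str] = []
    for token in tokenize(text):
        if IMPORTANCE.get(token, 0.0) >= min_score and token not in keywords:
            keywords.append(token)
    return keywords


def link_keyword(keyword: str, context: Set[str]) -> Optional[str]:
    """Task 2: choose the candidate article that best matches the context."""
    senses = SENSES.get(keyword, {})
    if not senses:
        return None
    return max(senses, key=lambda title: len(senses[title] & context))


def wikify_text(text: str) -> List[Tuple[str, Optional[str]]]:
    context = set(tokenize(text))
    return [(kw, link_keyword(kw, context)) for kw in extract_keywords(text)]


links = wikify_text("The jaguar hunts its prey in the rainforest.")
print(links)
```

Keeping the two tasks as separate functions mirrors the split in the quote: a different ranking or disambiguation method can replace either stage without touching the other.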