Wikipedia-directed Wikification System
A Wikipedia-directed Wikification System is a text wikification system that implements a Wikipedia-directed wikification algorithm to solve a Wikipedia-directed wikification task.
- Context:
- It was initially proposed by Mihalcea & Csomai (2007),
- …
- Example(s):
- Counter-Example(s):
- See: Document to Ontology Interlinking System, Natural Language Processing System, Wikimedia, Wikitext, GM-RKB WikiFixer System.
References
2019
- (Wikipedia, 2019) ⇒ https://en.wikipedia.org/wiki/Entity_linking Retrieved:2019-6-15.
- In natural language processing, entity linking, named entity linking (NEL), named entity disambiguation (NED), named entity recognition and disambiguation (NERD) or named entity normalization (NEN)[1] is the task of determining the identity of entities mentioned in text. For example, given the sentence "Paris is the capital of France", the idea is to determine that "Paris" refers to the city of Paris and not to Paris Hilton or any other entity that could be referred to as "Paris". NED is different from named entity recognition (NER) in that NER identifies the occurrence or mention of a named entity in text but does not identify which specific entity it is.
Entity linking requires a knowledge base containing the entities to which entity mentions can be linked. A popular choice for entity linking on open-domain text is a knowledge base based on Wikipedia, [2] in which each page is regarded as a named entity. NED using Wikipedia entities has also been called wikification (see Wikify!, an early entity linking system[3]). A knowledge base may also be induced automatically from training text [4] or manually built. (...)
- ↑ M. A. Khalid, V. Jijkoun and M. de Rijke (2008). The impact of named entity normalization on information retrieval for question answering. Proc. ECIR.
- ↑ Xianpei Han, Le Sun and Jun Zhao (2011). Collective entity linking in web text: a graph-based method. Proc. SIGIR.
- ↑ Rada Mihalcea and Andras Csomai (2007). Wikify! Linking Documents to Encyclopedic Knowledge. Proc. CIKM.
- ↑ Aaron M. Cohen (2005). Unsupervised gene/protein named entity normalization using automatically extracted dictionaries. Proc. ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics, pp. 17–24.
2014
- (Roth et al., 2014) ⇒ Dan Roth, Heng Ji, Ming-Wei Chang, and Taylor Cassidy. (2014). “Wikification and Beyond: The Challenges of Entity and Concept Grounding.” Tutorial at ACL 2014.
- QUOTE: Contextual disambiguation and grounding of concepts and entities in natural language text are essential to moving forward in many natural language understanding related tasks and are fundamental to many applications. The Wikification task (Bunescu and Pasca, 2006; Mihalcea and Csomai, 2007; Ratinov et al., 2011) aims at automatically identifying concept mentions appearing in a text document and linking each to (or “grounding it in”) a concept referent in a knowledge base (KB) (e.g., Wikipedia). For example, consider the sentence, "The Times report on Blumenthal (D) has the potential to fundamentally reshape the contest in the Nutmeg State.", a Wikifier should identify the key entities and concepts (Times, Blumenthal, D and the Nutmeg State), and disambiguate them by mapping them to an encyclopedic resource revealing, for example, that “D” here represents the Democratic Party, and that “the Nutmeg State” refers to Connecticut.
2013
- (Cheng & Roth, 2013) ⇒ Xiao Cheng, and Dan Roth. (2013). “Relational Inference for Wikification.” In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).
- QUOTE: Wikification (D2W), the task of identifying concepts and entities in text and disambiguating them into their corresponding Wikipedia page(...)
Given a document D containing a set of concept and entity mentions M (referred to later as surface), the goal of Wikification is to find the most accurate mapping from mentions to Wikipedia titles T; this mapping needs to take into account our understanding of the text as well as background knowledge that is often needed to determine the most appropriate title. We also allow a special NIL title that captures all mentions that are outside Wikipedia.
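The mention-to-title mapping described above, including the special NIL title, can be sketched as a most-common-target baseline over an anchor-text dictionary. The dictionary and its counts below are invented for illustration; a real system would derive them from Wikipedia's link structure.

```python
# Toy D2W mapper: each surface form maps to the Wikipedia title it most
# often links to; unknown surface forms map to NIL.
NIL = "NIL"

# Hypothetical anchor statistics: surface form -> {Wikipedia title: link count}.
ANCHOR_COUNTS = {
    "Chicago": {"Chicago": 950, "Chicago_(2002_film)": 50},
    "Paris": {"Paris": 900, "Paris_Hilton": 100},
}

def disambiguate(mention: str) -> str:
    """Return the most frequently linked title for a mention, or NIL
    when the surface form never appears as anchor text."""
    candidates = ANCHOR_COUNTS.get(mention)
    if not candidates:
        return NIL
    return max(candidates, key=candidates.get)
```

This baseline ignores context entirely; it only encodes the prior that most anchors with a given surface form point to one dominant title.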
2012
- (Cassidy et al., 2012) ⇒ Taylor Cassidy, Heng Ji, Lev-Arie Ratinov, Arkaitz Zubiaga, and Hongzhao Huang. (2012). “Analysis and Enhancement of Wikification for Microblogs with Context Expansion.” In: Proceedings of COLING-2012 (COLING-2012).
- QUOTE: Disambiguation to Wikipedia (D2W) is the task of linking mentions of concepts in text to their corresponding Wikipedia entries. Most previous work has focused on linking terms in formal texts (e.g. newswire) to Wikipedia. Linking terms in short informal texts (e.g. tweets) is difficult for systems and humans alike as they lack a rich disambiguation context.
2011
- (Ratinov et al., 2011) ⇒ Lev Ratinov, Dan Roth, Doug Downey, and Mike Anderson. (2011). “Local and Global Algorithms for Disambiguation to Wikipedia.” In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1. ISBN:978-1-932432-87-9 .
- QUOTE: Wikification is the task of identifying and linking expressions in text to their referent Wikipedia pages(...) Previous studies on Wikification differ with respect to the corpora they address and the subset of expressions they attempt to link. For example, some studies focus on linking only named entities, whereas others attempt to link all “interesting” expressions, mimicking the link structure found in Wikipedia. Regardless, all Wikification systems are faced with a key Disambiguation to Wikipedia (D2W) task. In the D2W task, we’re given a text along with explicitly identified substrings (called mentions) to disambiguate, and the goal is to output the corresponding Wikipedia page, if any, for each mention. For example, given the input sentence “I am visiting friends in <Chicago>,” we output http://en.wikipedia.org/wiki/Chicago – the Wikipedia page for the city of Chicago, Illinois, and not (for example) the page for the 2002 film of the same name.
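A purely local disambiguation step of the kind the quote describes can be sketched by scoring each candidate title on the overlap between the mention's context words and words associated with the candidate page. The candidate word sets below are hypothetical, not drawn from actual Wikipedia pages.

```python
# Local D2W scorer: pick the candidate title whose associated vocabulary
# overlaps most with the mention's context window.
CANDIDATE_WORDS = {
    "Chicago": {"city", "illinois", "lake", "michigan"},
    "Chicago_(2002_film)": {"film", "musical", "oscar", "2002"},
}

def rank_candidates(context_words):
    """Return the candidate title with the largest context overlap."""
    ctx = {w.lower() for w in context_words}
    scored = {title: len(ctx & words) for title, words in CANDIDATE_WORDS.items()}
    return max(scored, key=scored.get)
```

For the sentence “I am visiting friends in Chicago”, a context containing words like “city” or “Illinois” would favor the city page over the film page; global approaches extend this by also scoring the coherence among all chosen titles in the document.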
2008
- (Csomai & Mihalcea, 2008) ⇒ Andras Csomai, and Rada Mihalcea. (2008). “Linking Documents to Encyclopedic Knowledge.” In: IEEE Intelligent Systems 23(5). doi:10.1109/MIS.2008.86
2007
- (Mihalcea & Csomai, 2007) ⇒ Rada Mihalcea, and Andras Csomai. (2007). “Wikify!: Linking documents to encyclopedic knowledge.” In: Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management (CIKM 2007). doi:10.1145/1321440.1321475
- QUOTE: Given a text or hypertext document, we define “text wikification” as the task of automatically extracting the most important words and phrases in the document, and identifying for each such keyword the appropriate link to a Wikipedia article. This is the typical task performed by the Wikipedia users when contributing articles to the Wikipedia repository(...) Automatic text wikification implies solutions for the two main tasks performed by a Wikipedia contributor when adding links to an article: (1) keyword extraction, and (2) link disambiguation.
The first task consists of identifying those words and phrases that are considered important for the document at hand(...)
The second task consists of finding the correct Wikipedia article that should be linked to a candidate keyword (...)
We designed and implemented a system that solves the “text wikification” problem in four steps, as illustrated in Figure 2.
Figure 2: The architecture of the system for automatic text wikification.
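The two tasks named in the quote can be sketched as a minimal pipeline: (1) keyword extraction via a "keyphraseness" score (how often a phrase occurs as anchor text relative to its total occurrences in Wikipedia), then (2) link disambiguation by the phrase's most common target. All statistics below are invented for illustration and are not from an actual Wikipedia dump.

```python
# Toy two-stage wikifier: keyphraseness filter, then most-common-target link.
PHRASE_STATS = {
    # phrase: (times used as anchor text, total occurrences, most common target)
    "machine learning": (800, 1000, "Machine_learning"),
    "the": (10, 1_000_000, "The"),
}

def wikify(text: str, threshold: float = 0.05):
    """Return (phrase, Wikipedia title) pairs for phrases in the text
    whose keyphraseness exceeds the threshold."""
    links = []
    lowered = text.lower()
    for phrase, (as_link, total, target) in PHRASE_STATS.items():
        keyphraseness = as_link / total
        if phrase in lowered and keyphraseness >= threshold:
            links.append((phrase, target))
    return links
```

A stop word like "the" is filtered out by its near-zero keyphraseness, while a phrase that is usually linked when it appears passes the filter and is mapped to its dominant target.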