Definition Extraction (DE) System

AKA: Definition Detection System.
Context:
- It can range from (typically) being a Definitional Sentence Extraction System to being a Definitional Paragraph Extraction System.
- It can range from being a Rule-based Definition Extraction System, to being an ML-based Definition Extraction System, to being an ANN-based Definition Extraction System.
- …
Example(s):
- a DefExt System (Espinosa-Anke et al., 2016),
- a Pattern Matching Definition Extraction System (e.g. Westerhout, 2009),
- a Semi-Structured Text Definition Extraction System (e.g. Curtotti et al., 2013),
- a Word-Class Lattices (WCLs) Definition Extraction System (Navigli & Velardi, 2010),
- a DefExplorer System (Leu & Ko, 2010),
- an ECODE System (Alarcon et al., 2009),
- an Evolutionary Definition Extraction System (e.g. Borg et al., 2009),
- an Indexed Reference Identification (IRI) Definition Extraction System (Bertin et al., 2009),
- a GlossExtractor DE System (Navigli & Velardi, 2007).
- …
Counter-Example(s):
See: Definitional Sentence Generation System, Automated Definitional Sentence Extraction Task, Bootstrapping Algorithm, Automatic Glossary Generation System, Taxonomy Learning System, Question-Answering System, Semantic Search System.

References

(Leu & Ko, 2010) ⇒ Fang-Yie Leu, and Chih-Chieh Ko (2010). "An Automated Term Definition Extraction System Using the Web Corpus in the Chinese Language". In: Journal of Information Science and Engineering, 26, 505-525.
- QUOTE: DefExplorer extracts definitions or their equivalences for Chinese terms in six phases, as shown in Fig. 1, including question analysis, document retrieval, semantics selection, similarity scoring, candidate grouping (also called candidate clustering), and answer generation. The first two phases respectively retrieve a given term's corresponding patterns, and submit the patterns to search results. The third phase removes semantically inappropriate search results sentences, and identifies the key portion of a definition sentence. In the fourth and fifth phases, DefExplorer calculates similarities between each sentence and other definition sentences, and clusters semantically similar sentences into a group. The last phase selects top-ranked sentences as the final results. In the following, we will describe the six phases, and explain why they are employed.

**Figure 1:** The DefExplorer system architecture.
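
The last three phases described in the quote (similarity scoring, candidate grouping, and answer generation) can be illustrated with a minimal sketch. It is not DefExplorer's implementation: the quote does not specify the similarity measure, so token-level Jaccard overlap is assumed here, and the function names, the greedy grouping strategy, and the `threshold` value are all hypothetical. Whitespace tokenization is used only to keep the example readable in English; Chinese text would need a word segmenter.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity (an assumed stand-in for the paper's measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def score_candidates(sentences: List[str]) -> List[float]:
    """Similarity scoring: mean similarity of each sentence to all the others."""
    scores = []
    for i, s in enumerate(sentences):
        others = [jaccard(s, t) for j, t in enumerate(sentences) if j != i]
        scores.append(sum(others) / len(others) if others else 0.0)
    return scores


def group_candidates(sentences: List[str], threshold: float = 0.5) -> List[List[str]]:
    """Candidate grouping: greedy one-pass clustering against each group's first member."""
    groups: List[List[str]] = []
    for s in sentences:
        for g in groups:
            if jaccard(s, g[0]) >= threshold:
                g.append(s)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([s])
    return groups


def top_ranked(sentences: List[str], k: int = 1) -> List[str]:
    """Answer generation: return the k highest-scoring candidate sentences."""
    scores = score_candidates(sentences)
    ranked = sorted(zip(scores, sentences), key=lambda pair: pair[0], reverse=True)
    return [s for _, s in ranked[:k]]
```

With this scoring, a candidate that resembles many other candidates ranks above an outlier, which is one plausible reading of why similarity between candidate sentences is computed before the final selection.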

(Westerhout, 2009) ⇒ Eline Westerhout (2009). "Definition Extraction using Linguistic and Structural Features". In: Proceedings of the 1st Workshop on Definition Extraction (WDE 2009).
- QUOTE: Different approaches for the extraction of definitions can be distinguished. We use a sequential combination of a rule-based approach and machine learning to extract them. As a first step a grammar is used to match sentences with a definition pattern and thereafter, machine learning techniques are applied to filter out those sentences that – although they have a definition pattern – do not qualify as definitions.
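
The sequential combination described in this quote can be sketched as a two-stage pipeline. This is an illustration under stated assumptions, not Westerhout's grammar or classifier: the single regular expression stands in for the grammar, and `stand_in_filter` is a hand-written placeholder occupying the slot where a trained machine learning classifier would go. All names below are hypothetical.

```python
import re
from typing import Callable, Iterable, List

# Stage 1 (assumed toy "grammar"): matches "<Term> is/are a/an/the <gloss>".
DEFINITION_PATTERN = re.compile(
    r"^(?P<term>[A-Z][\w\- ]*?) (?:is|are) (?:a|an|the) (?P<gloss>.+)$"
)

# Subjects that rarely introduce a real definition.
PRONOUN_SUBJECTS = {"this", "that", "it", "he", "she", "there", "these", "those"}


def matches_definition_pattern(sentence: str) -> bool:
    """Stage 1: keep sentences that have a definition-like surface pattern."""
    return DEFINITION_PATTERN.match(sentence) is not None


def stand_in_filter(sentence: str) -> bool:
    """Stage 2 placeholder: NOT a learned model, only a hand-set rule.

    A real system would call a trained classifier here.
    """
    words = sentence.split()
    return bool(words) and words[0].lower() not in PRONOUN_SUBJECTS


def extract_definitions(
    sentences: Iterable[str],
    keep: Callable[[str], bool] = stand_in_filter,
) -> List[str]:
    """Run the pattern stage first, then filter the pattern matches."""
    candidates = [s for s in sentences if matches_definition_pattern(s)]
    return [s for s in candidates if keep(s)]
```

Passing the filter as a callable keeps the two stages independent, so a trained model's prediction function could replace `stand_in_filter` without touching the pattern stage.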

(Navigli & Velardi, 2007) ⇒ Roberto Navigli, and Paola Velardi (2007, September). "GlossExtractor: A Web Application to Automatically Create a Domain Glossary". In: Congress of the Italian Association for Artificial Intelligence (pp. 339-349). Springer, Berlin, Heidelberg.
- QUOTE: Figure 1 shows the basic steps of the glossary extraction algorithm. The input to the system is a list $T$ of terms, for which a glossary has to be learned. Possibly, this list is the result of a previous terminology extraction process.
  The first phase is candidate extraction: for each term, definition sentences are searched first, in on-line glossaries, then, in on-line documents. Simple, manually defined regular expressions are used to extract the candidate definition sentences.

**Figure 1:** The glossary extraction algorithm.
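
The candidate extraction phase described in this quote can be sketched as follows. The regular expressions are illustrative guesses, not the manually defined expressions used by GlossExtractor, and the glossaries and documents are assumed to be plain strings already held in memory rather than on-line resources. The names `term_patterns` and `extract_candidates` are hypothetical.

```python
import re
from typing import Dict, List, Pattern, Sequence


def term_patterns(term: str) -> List[Pattern[str]]:
    """Build simple definition patterns for one term (assumed examples)."""
    t = re.escape(term)
    return [
        re.compile(rf"\b{t}\s+(?:is|are)\s+(?:a|an|the)\s+[^.]+\.", re.IGNORECASE),
        re.compile(rf"\b{t}\s+(?:refers to|is defined as)\s+[^.]+\.", re.IGNORECASE),
    ]


def extract_candidates(
    terms: Sequence[str],
    glossary_texts: Sequence[str],
    document_texts: Sequence[str],
) -> Dict[str, List[str]]:
    """For each term in the list T, search glossaries first, then documents."""
    candidates: Dict[str, List[str]] = {term: [] for term in terms}
    for term in terms:
        patterns = term_patterns(term)
        # Glossary texts come first so their matches are listed first.
        for text in list(glossary_texts) + list(document_texts):
            for pattern in patterns:
                for match in pattern.finditer(text):
                    sentence = match.group(0).strip()
                    if sentence not in candidates[term]:
                        candidates[term].append(sentence)
    return candidates
```

The output is only a set of candidate definition sentences per term; any later selection among the candidates is outside the scope of this sketch.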