Relation Mention Recognition Algorithm
A relation mention recognition algorithm is a recognition algorithm (detection and classification) that can solve a relation mention recognition task.
- AKA: Relation Recognition Algorithm, Semantic Relation Mention Recognition Algorithm, Relation Extraction Algorithm, Relation Extraction from Text Algorithm.
- Context:
- It can be composed of a Relation Mention Detection Algorithm and a Relation Mention Classification Algorithm.
- It can support a Relation Mention Extraction Algorithm (e.g. as part of an Information Extraction Algorithm).
- It can range from being a Simple Semantic Relation Recognition Algorithm to being a Complex Semantic Relation Recognition Algorithm/Complex Relation Extraction Algorithm.
- It can range from being a Pattern-based Relation Mention Recognition Algorithm to being a Cluster-based Relation Mention Recognition Algorithm.
- It can range from being a Heuristic Relation Mention Recognition Algorithm to being a Data-Driven Relation Mention Recognition Algorithm (such as a Supervised Relation Mention Recognition Algorithm).
- It can require Preprocessing tasks such as Chunking, Named Entity Recognition, Syntactic Parsing, and Semantic Role Labeling.
- It can involve a Positive Sentence Harvesting phase.
- Approaches include: Word-sequence Rule, Co-occurrence Pattern, Word-sequence Pattern, Syntactic Pattern.
- It can be applied by a Relation Mention Recognition System.
- It has been evaluated with the PPLRE Automated Evaluation System.
- Example(s):
- AutoSlog,
- BBN SIFT,
- BRN,
- DIPRE,
- Espresso Algorithm,
- FASTUS,
- iProLINK (http://pir.georgetown.edu/iprolink/),
- LEILA,
- Naive Relation Recognition Algorithm,
- Protein-Protein Interaction Mention Recognition Algorithm,
- PubGene (http://www.pubgene.org), a co-occurrence-based system (e.g. it reports no direct co-occurrence of the gene and protein crcA with the exact keyword expression "localization" in MEDLINE abstracts),
- Snowball Algorithm,
- Shortest Path Dependency Relation Extraction Algorithm (Bunescu & Mooney, 2005),
- TeGRR Algorithm, a Complex Semantic Relation Mention Recognition Algorithm,
- TextRunner Algorithm,
- Wrapper Induction Algorithm,
- ZParser,
- ZParser Bootstrapped.
- …
- Counter-Example(s):
- See: Sentence-level Analysis, Discourse-level Analysis, PPLRE Project, PPLRE Evaluation - Snowball.
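The composition described under Context (a detection step followed by a classification step) can be illustrated with a minimal sketch. It is not taken from any of the listed systems: the entity lexicon, the single word-sequence pattern, the `localized_in` label, and the example sentence are all hypothetical stand-ins chosen for illustration. Detection here is plain sentence-level co-occurrence of known entities; classification is one hand-written word-sequence pattern.

```python
import re
from itertools import combinations
from typing import List, NamedTuple, Tuple


class RelationMention(NamedTuple):
    sentence: str
    arg1: str
    arg2: str
    label: str


# Hypothetical toy lexicon standing in for a Named Entity Recognition step.
ENTITIES = ["OmpA", "crcA", "outer membrane", "cytoplasm"]

# Hypothetical word-sequence pattern placed between the two arguments.
LOCALIZATION_GAP = r"\s+(?:is|was)\s+(?:localized|located|found)\s+(?:in|to)\s+(?:the\s+)?"


def detect_candidates(sentence: str) -> List[Tuple[str, str]]:
    """Detection: every pair of known entities that co-occur in the sentence."""
    found = [e for e in ENTITIES if e in sentence]
    found.sort(key=sentence.index)  # keep textual order of first occurrence
    return list(combinations(found, 2))


def classify(sentence: str, arg1: str, arg2: str) -> str:
    """Classification: label a candidate pair using the word-sequence pattern."""
    pattern = re.escape(arg1) + LOCALIZATION_GAP + re.escape(arg2)
    return "localized_in" if re.search(pattern, sentence) else "no_relation"


def recognize(sentence: str) -> List[RelationMention]:
    """Recognition = detection followed by classification."""
    return [
        RelationMention(sentence, a, b, classify(sentence, a, b))
        for a, b in detect_candidates(sentence)
    ]


# Made-up example sentence, for illustration only.
SENTENCE = "OmpA is localized to the outer membrane, whereas crcA was found in the cytoplasm."
for mention in recognize(SENTENCE):
    print(mention.arg1, "|", mention.arg2, "|", mention.label)
```

Co-occurrence alone proposes every entity pair in the sentence, so the pattern step is what filters the candidates down to labelled mentions; a data-driven variant would replace `classify` with a trained classifier.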
References
2012
- (Melli, 2012) ⇒ Gabor Melli. (2012). “Identifying Untyped Relation Mentions in a Corpus given an Ontology.” In: Workshop Proceedings of TextGraphs-7: Graph-based Methods for Natural Language Processing (TextGraphs-7).
- (Lao, Subramanya et al., 2012) ⇒ Ni Lao, Amarnag Subramanya, Fernando Pereira, and William W. Cohen. (2012). “Reading The Web with Learned Syntactic-Semantic Inference Rules." In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing, and Computational Natural Language Learning (EMNLP-CoNLL, 2012).
2009
- (Rajaraman & Ullman, 2009o) ⇒ Anand Rajaraman, and Jeffrey D. Ullman. (2009). “Relation Extraction." Stanford Course - CS345A, Winter 2009: Data Mining. 2/2-2/4
2007a
- (Bunescu & Mooney, 2007) ⇒ Razvan C. Bunescu, and Raymond Mooney. (2007). “Learning to Extract Relations from the Web using Minimal Supervision.” In: Proceedings of 2007 ACL Conference (ACL 2007).
2007b
- (Melli et al., 2007) ⇒ Gabor Melli, Martin Ester, and Anoop Sarkar. (2007). “Recognition of Multi-sentence n-ary Subcellular Localization Mentions in Biomedical Abstracts.” In: Proceedings of LBM-2007.
2007c
- (Banko et al., 2007) ⇒ Michele Banko, M. J. Cafarella, S. Soderland, M. Broadhead, and Oren Etzioni. (2007). “Open Information Extraction from the Web.” In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-2007).
2007d
- (Nguyen et al., 2007a) ⇒ Dat P.T. Nguyen, Yutaka Matsuo, and Mitsuru Ishizuka. (2007). “Relation Extraction from Wikipedia Using Subtree Mining.” In: Proceedings of AAAI 2007 (AAAI 2007).
2007f
- (Bunescu and Mooney, 2007) ⇒ Razvan C. Bunescu and Raymond Mooney. (2007). “Extracting Relations from Text: From Word Sequences to Dependency Paths.” In: Text Mining and Natural Language Processing, Anne Kao and Steve Poteet (eds.), pp. 29-44, Springer.
2007g
- (Fundel et al., 2007) ⇒ Katrin Fundel, R. Kuffner, and R. Zimmer. (2007). “RelEx--relation extraction using dependency parse trees." Bioinformatics. 2007 Feb 1;23(3):365-71.
- QUOTE: The simplest approach is the detection of co-occurrences of entities from within sentences or abstracts (Ding et al., 2002; Jelier et al., 2005; Jenssen et al., 2001). It relies on the hypothesis that entities which are repeatedly mentioned together are somehow related. Extracted relations exhibit high sensitivity but very low specificity. Generally, the type and direction of the relation cannot be determined.
Pattern based extraction approaches (Blaschke et al., 1999; Blaschke and Valencia, 2001; Leroy and Chen, 2002; Ono et al., 2001) were set up to increase specificity, yet they achieve significantly lower recall.
As an extension to standard relation extraction pipelines, we propose the use of dependency parse trees (Klein and Manning, 2002, 2003; Mel’cuk, 1988) as a means for biomedical relation extraction. Dependency parse trees reveal non-local dependencies within sentences, i.e. between words that are far apart in a sentence. Sentences of biomedical texts tend to be long and complicated and frequently mention a number of possible effectors and effectees. Dependency parse trees provide a useful structure for the sentences by annotating edges with dependency types, e.g. subject, auxiliary, modifier.
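The point in the quote above, that dependency parse trees expose relations between words that are far apart in the sentence, can be illustrated with a small sketch. It is not the RelEx rule set or the Bunescu & Mooney kernel; it only shows the underlying idea of taking the shortest path between two entity mentions in a dependency graph. The sentence, the entity names, and the dependency edges are hand-written assumptions rather than the output of a real parser, and the edge types reuse the examples named in the quote (subject, auxiliary, modifier).

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hand-written (head, dependency_type, dependent) edges for the made-up sentence
# "ProteinA, which was purified from yeast, activates ProteinB."
EDGES: List[Tuple[str, str, str]] = [
    ("activates", "subject", "ProteinA"),
    ("activates", "object", "ProteinB"),
    ("ProteinA", "modifier", "purified"),
    ("purified", "subject", "which"),
    ("purified", "auxiliary", "was"),
    ("purified", "modifier", "from"),
    ("from", "object", "yeast"),
]


def shortest_dependency_path(
    edges: List[Tuple[str, str, str]], source: str, target: str
) -> Optional[List[str]]:
    """Breadth-first search over an undirected view of the dependency edges."""
    graph: Dict[str, List[str]] = {}
    for head, _dep_type, dependent in edges:
        graph.setdefault(head, []).append(dependent)
        graph.setdefault(dependent, []).append(head)

    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbour in graph.get(path[-1], []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # the two words are not connected


print(shortest_dependency_path(EDGES, "ProteinA", "ProteinB"))
```

In the surface word sequence the two entities are separated by the whole relative clause, whereas the dependency path connects them through the verb alone, which is why such paths are attractive as features or patterns for relation classification.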
2007h
- (Jiang and Zhai, 2007) ⇒ J. Jiang and C. Zhai. (2007). “A Systematic Exploration of the Feature Space for Relation Extraction.” In: Proceedings of NAACL/HLT-2007.
2006a
- (Zhang et al., 2006b) ⇒ Min Zhang, Jie Zhang, and Jian Su. (2006). “Exploring Syntactic Features for Relation Extraction using a Convolution Tree Kernel.” In: Proceedings of HLT Conference (HLT 2006).
2006b
- (Hassan et al., 2006) ⇒ H. Hassan, A. Hassan, and S. Noeman. (2006). “Graph Based Semi-Supervised Approach for Information Extraction.” In: Proceedings of the Workshop on Graph-based methods for NLP at HLT/NAACL-2006.
2006c
- (Suchanek et al., 2006) ⇒ F. M. Suchanek, G. Ifrim and G. Weikum. (2006). “Combining Linguistic and Statistical Analysis to Extract Relations from Web Documents.” In: Proceedings of KDD-2006.
2006d
- (Girju et al., 2006) ⇒ Roxana Girju, Adriana Badulescu, and Dan Moldovan. (2006). “Automatic Discovery of Whole-Part Relations.” In: Computational Linguistics, 32(1). doi:10.1162/coli.2006.32.1.83
2006e
- (Culotta et al., 2006) ⇒ Aron Culotta, Andrew McCallum, and Jonathan Betz. (2006). “Integrating Probabilistic Extraction Models and Data Mining to Discover Relations and Patterns in Text.” In: Proceedings of HLT-NAACL 2006.
- QUOTE: ... Common approaches to this problem include pattern matching (Brin, 1998; Agichtein and Gravano, 2000), kernel methods (Zelenko et al, 2003; Culotta and Sorensen, 2004; Bunescu & Mooney, 2005), logistic regression (Kambhatla, 2004), and augmented parsing (Miller et al., 2000).
2006f
- (Feldman et al., 2006) ⇒ Ronen Feldman, B. Rosenfeld, S. Soderland, and Oren Etzioni. (2006). “Self-Supervised Relation Extraction from the Web.” In: Proceedings of ISMIS 2006.
2006g
- (Xia, 2006) ⇒ L. Xia. (2006). “Adaptive Relationship Extraction by Machine Learning." Masters Thesis, Sheffield University.
2006i
- (Greenwood and Stevenson, 2006) ⇒ M. A. Greenwood and M. Stevenson. (2006). “Improving Semi-Supervised Acquisition of Relation Extraction Patterns.” In: Proceedings of the Information Extraction Beyond The Document Workshop (COLING/ACL 2006).
2006
- (Chakaravarthy et al., 2006) ⇒ Venkatesan T. Chakaravarthy, H. Gupta, P. Roy, and M. Mohania. (2006). “Efficiently Linking Text Documents with Relevant Structured Information.” In: Proceedings of VLDB, 2006.
2005a
- (Gonzalez et al., 2005) ⇒ M. Gonzalez, V. L. S. de Lima, and J. V. de Lima. (2005). “Binary Lexical Relations for Text Representation in Information Retrieval.” In: Proceedings of 10th International Conference on Applications of Natural Language to Information Systems (NLDB-2005).
2005b
- (Harabagiu et al., 2005) ⇒ Sanda M. Harabagiu, C. A. Bejan, and P. Morarescu. (2005). “Shallow Semantics for Relation Extraction.” In: Proceedings of IJCAI-2005.
2005c
- (Bunescu & Mooney, 2005) ⇒ Razvan C. Bunescu, and Raymond Mooney. (2005). “A Shortest Path Dependency Kernel for Relation Extraction.” In: Proceedings of HLT/EMNLP-2005.
2005d
- (Moreda et al., 2005) ⇒ P. Moreda, B. Navarro, and M. Palomar. (2005). “Using Semantic Roles in Information Retrieval Systems.” In: Proceedings of 10th International Conference on Applications of Natural Language to Information Systems (NLDB-2005).
2005f
- (Zhao & Grishman, 2005) ⇒ Shubin Zhao, and Ralph Grishman. (2005). “Extracting Relations with Integrated Information Using Kernel Methods.” In: Proceedings of ACL Conference (ACL 2005).
2005g
- (McDonald et al., 2005) ⇒ Ryan T. McDonald, Fernando C. N. Pereira, Seth Kulick, R. Scott Winters, Yang Jin, and Peter S. White. (2005). “Simple Algorithms for Complex Relation Extraction with Applications to Biomedical IE.” In: Proceedings of ACL-2005.
2005h
- (Ramani et al., 2005) ⇒ A. K. Ramani, Razvan C. Bunescu, Raymond Mooney and E. M. Marcotte. (2005). “Consolidating the Set of Known Human Protein-Protein Interactions in Preparation for Large-Scale Mapping of the Human Interactome." Genome Biology, volume 6, number 5, r40.
2005i
- (Dong et al., 2005) ⇒ X. Dong, A. Halevy, and J. Madhavan. (2005). “Reference Reconciliation in Complex Information Spaces.” In: Proceedings of SIGMOD, 2005.
2004a
- (Rosenfeld et al., 2004) ⇒ B. Rosenfeld, Ronen Feldman, M. Fresko, J. Schler, and Y. Auman. (2004). “TEG - A Hybrid Approach to Information Extraction.” In: Proceedings of the 2004 CIKM Conference (CIKM 2004).
2004b
- (Jijkoun et al., 2004) ⇒ Valentin Jijkoun, Jori Mur, and Maarten de Rijke. (2004). “Information extraction for question answering: Improving recall through syntactic patterns.” In: Proceedings of COLING-2004.
2004c
- (Culotta and Sorensen, 2004) ⇒ Aron Culotta and J. S. Sorensen. (2004). “Dependency Tree Kernels for Relation Extraction.” In: Proceedings of ACL 2004.
2003
- (Zelenko et al., 2003) ⇒ D. Zelenko, C. Aone, and A. Richardella. (2003). “Kernel Methods for Relation Extraction.” In: Journal of Machine Learning Research.
- (Arasu & Garcia-Molina, 2003) ⇒ A. Arasu, and Hector Garcia-Molina. (2003). “Extracting structured data from web pages.” In: Proceedings of the 2003 ACM SIGMOD international conference (SIGMOD 2003). doi:10.1145/872757.872799
2001
- (Crescenzi et al., 2001) ⇒ Valter Crescenzi, Giansalvatore Mecca, and Paolo Merialdo. (2001). “RoadRunner: Towards Automatic Data Extraction from Large Web Sites.” In: Proceedings of the 27th International Conference on Very Large Data Bases (VLDB 2001).
2000a
- (Agichtein and Gravano, 2000) ⇒ Eugene Agichtein and L. Gravano. (2000). ??
2000b
- (McCallum et al., 2000) ⇒ Andrew McCallum, K. Nigam, J. Rennie, and K. Seymore. (2000). “Automating the construction of internet portals with machine learning.” In: Information Retrieval Journal.
2000c
- (Kushmerick, 2000) ⇒ Nicholas Kushmerick. (2000). “Wrapper induction: Efficiency and Expressiveness.” In: Artificial Intelligence 118(1-2).
1999
- (Soderland, 1999) ⇒ Stephen Soderland. (1999). “Learning Information Extraction Rules for Semi-Structured and Free Text.” In: Machine Learning. http://portal.acm.org/citation.cfm?id=309510
1998a
- (Freitag, 1998) ⇒ Dayne Freitag. (1998). “Information Extraction from HTML: Application of a general learning approach.” In: Proceedings of the Fifteenth Conference on Artificial Intelligence (AAAI/IAAI 1998).
1998b
- (Giles et al., 1998) ⇒ C. L. Giles, K. Bollacker, and S. Lawrence. (1998). “CiteSeer: An automatic citation indexing system.” In: Proceedings of the Third ACM Conference on Digital Libraries.
1998c
- (Craven et al., 1998) ⇒ M. Craven, D. DiPasquo, Dayne Freitag, Andrew McCallum, Tom M. Mitchell, K. Nigam, and S. Slattery. (1998). “Learning to extract symbolic knowledge from the world wide web.” In: Proceedings of AAAI-98.
1998d
- (Brin, 1998) ⇒ S. Brin. (1998). “Extracting patterns and relations from the World-Wide Web.” In: Proceedings of the 1998 International Workshop on Web and Databases (WebDB’98).
1997a
- (Khoo, 1997) ⇒ C. Khoo. (1997). “The Use of Relation Matching in Information Retrieval.” In: LIBRES: Library and Information Science Research Electronic Journal, 7(2).
1997b
- (Kushmerick et al., 1997) ⇒ Nicholas Kushmerick, D. S. Weld, and R. B. Doorenbos. (1997). “Wrapper Induction for Information Extraction.” In: Intl. Joint Conference on Artificial Intelligence (IJCAI 1997).
- Use of Wrapper Algorithm with induction
- Wrapper Induction Algorithm, Relation Mention Pattern.
1997c
- (Leek, 1997) ⇒ T. R. Leek. (1997). “Information Extraction using Hidden Markov Models.” Master's thesis, UC San Diego. http://citeseer.ist.psu.edu/leek97information.html
- Application of Hidden Markov Models
1997d
- (Soderland, 1997) ⇒ Stephen Soderland. (1997). “Learning to extract Text Based information from the World Wide Web.” In: ...?
1996
- (Mitchell et al., 1996) ⇒ Tom Mitchell, ...
1995a
- (Soderland et al., 1995) ⇒ Stephen Soderland, David Fisher, Jonathan Aseltine, and Wendy G. Lehnert. (1995). “CRYSTAL: Inducing a Conceptual Dictionary.” In: Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI 1995).
1995b
- (Khoo, 1995) ⇒ Christopher Soo-Guan Khoo. (1995). “Automatic Identification of Causal Relations in Text and Their Use for Improving Precision in Information Retrieval." Doctoral dissertation, Syracuse University.
- An early reference that I have discovered to contain the phrase "semantic relation" (in the context of computing science).
- QUOTE: By semantic relation, I mean the logical or conceptual relation expressed in the text but not wholly dependent on the particular syntactic structure of the sentence.
1993
- (Riloff, 1993) ⇒ Ellen Riloff. (1993). “Automatically Constructing a Dictionary for Information Extraction Tasks.” In: Proceedings of the 11th National Conference on Artificial Intelligence (AAAI 1993).
1992
- (Hearst, 1992) ⇒ Marti Hearst. (1992). “Automatic Acquisition of Hyponyms from Large Text Corpora.” In: Proceedings of the 14th International Conference on Computational Linguistics (COLING-1992).
1991
- (Rau, 1991) ⇒ L. Rau. (1991). “Extracting Company Names From Text.” In: Proceedings of the Sixth Conference on Artificial Intelligence Applications.
1982
- De Jong (1982)
- NOTE: The FRUMP system filled in Schank-style “scripts” from newswires; DARPA’s Message Understanding Conference (MUC) [’87-’95], and TIPSTER [’92-’96].
Notes
- Early work focused on news feeds and hand-programmed rules (e.g. finite-state machines).
- E.g. De Jong’s FRUMP [1982], which filled in Schank-style “scripts” from newswires; DARPA’s Message Understanding Conference (MUC) [’87-’95]; and TIPSTER [’92-’96].
- E.g. the finite-state machines of SRI’s FASTUS.
- HMMs in Elkan [Leek, 1997].
- BBN in [Bikel et al., 1998].