Andrew McCallum
Andrew McCallum is a person.
- Context:
- He is a Data Mining Researcher, with a focus on CRFs and Factor Graphs.
- He is a Data Mining Practitioner, with a focus on Information Extraction.
- He initiated the MALLET software development project.
- He initiated the FACTORIE software development project.
- See: CORA, University of Massachusetts.
References
- Professional Homepage: http://www.cs.umass.edu/~mccallum
- DBLP Author Page: http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/m/McCallum:Andrew.html
- Google Scholar Page: http://scholar.google.com/citations?user=yILa1y0AAAAJ
- http://www.gabormelli.com/RKB/Special:SearchByProperty/author/Andrew-20McCallum
2022
- (Das et al., 2022) ⇒ Rajarshi Das, Ameya Godbole, Ankita Naik, Elliot Tower, Manzil Zaheer, Hannaneh Hajishirzi, Robin Jia, and Andrew McCallum. (2022). “Knowledge Base Question Answering by Case-based Reasoning over Subgraphs.” In: International Conference on Machine Learning, pp. 4777-4793. PMLR.
2019
- (Strubell et al., 2019) ⇒ Emma Strubell, Ananya Ganesh, and Andrew McCallum. (2019). “Energy and Policy Considerations for Deep Learning in NLP.” arXiv preprint arXiv:1906.02243
2017
- (Das et al., 2017) ⇒ Rajarshi Das, Arvind Neelakantan, David Belanger, and Andrew McCallum. (2017). “Chains of Reasoning over Entities, Relations, and Text Using Recurrent Neural Networks.” In: Proceedings of EACL-2017.
2016
- (McCallum, 2016) ⇒ Andrew McCallum. (2016). “Universal Schema for Representation and Reasoning from Natural Language.” Invited Talk at the 5th Workshop on Automated Knowledge Base Construction (AKBC-2016).
- ABSTRACT: Interest in creating KBs has often been motivated by the desire to support reasoning on information that would otherwise be expressed in noisy free text and spread across multiple documents. However, distilling knowledge into a restricted KB can lose important semantic diversity and context. Traditionally a KB has a single hand-designed schema of entity- and relation-types. In contrast, universal schema operates on the union of many input schemas, including a great diversity of free textual expressions. However, previous work on universal schema still distills many textual contexts of the relation between an entity pair into a single embedded vector. In this talk I will introduce universal schema, then describe recent work leading toward (a) having the textual entity- and relation-mentions themselves represent the KB, (b) using universal schema and neural attention models to provide generalization, (c) logical reasoning on top of this text-KB, and (d) future work on reinforcement learning to guide the search for proofs of the answers to queries.
2015
- (Vilnis & McCallum, 2015) ⇒ Luke Vilnis, and Andrew McCallum. (2015). “Word Representations via Gaussian Embedding.” Submitted to ICLR 2015.
- (McCallum, 2015) ⇒ Andrew McCallum. (2015). “Representation and Reasoning with Universal Schema Embeddings.” Invited Talk at ICSW-2015
2014
- (Kobren et al., 2014) ⇒ Ari Kobren, Thomas Logan, Siddarth Sampangi, and Andrew McCallum. (2014). “Domain Specific Knowledge Base Construction via Crowdsourcing.” In: Proceedings of NIPS workshop on Automated Knowledge Base Construction (AKBC 2014).
- (Vilnis & McCallum, 2014) ⇒ Luke Vilnis, and Andrew McCallum. (2014). “Word Representations via Gaussian Embedding.” In: CoRR, abs/1412.6623.
2013
- (Riedel et al., 2013) ⇒ Sebastian Riedel, Limin Yao, Andrew McCallum, and Benjamin M. Marlin. (2013). “Relation Extraction with Matrix Factorization and Universal Schemas.” In: Proceedings of the Joint Human Language Technology Conference/Annual Meeting of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2013).
- (Anzaroot & McCallum, 2013) ⇒ Sam Anzaroot, and Andrew McCallum. (2013). “A New Dataset for Fine-grained Citation Field Extraction.” In: Proceedings of ICML Workshop on Peer Reviewing and Publishing Models (PEER-2013).
2010
- (Singh et al., 2010) ⇒ Sameer Singh, Limin Yao, Sebastian Riedel, and Andrew McCallum. (2010). “Constraint-Driven Rank-based Learning for Information Extraction.” In: Proceedings of Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2010).
- (Riedel et al., 2010) ⇒ Sebastian Riedel, Limin Yao, and Andrew McCallum. (2010). “Modeling Relations and their Mentions Without Labeled Text.” In: Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases (ECML/PKDD-2010).
2009
- (McCallum et al., 2009) ⇒ Andrew McCallum, Karl Schultz, and Sameer Singh. (2009). “FACTORIE: Probabilistic Programming via Imperatively Defined Factor Graphs.” In: Advances in Neural Information Processing Systems 22 (NIPS 2009).
- (Yao et al., 2009) ⇒ Limin Yao, David Mimno, and Andrew McCallum. (2009). “Efficient Methods for Topic Model Inference on Streaming Document Collections.” In: Proceedings of ACM SIGKDD Conference (KDD-2009). 10.1145/1557019.1557121
- (Wick et al., 2009) ⇒ Michael Wick, Aron Culotta, Khashayar Rohanimanesh, and Andrew McCallum. (2009). “An Entity Based Model for Coreference Resolution.” In: Proceedings of the SIAM International Conference on Data Mining (SDM 2009).
- (Bellare & McCallum, 2009) ⇒ Kedar Bellare, and Andrew McCallum. (2009). “Generalized Expectation Criteria for Bootstrapping Extractors using Record-Text Alignment.” In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP 2009).
2008
- (Wick et al., 2008) ⇒ Michael Wick, Khashayar Rohanimanesh, Karl Schultz, and Andrew McCallum. (2008). “A Unified Approach for Schema Matching, Coreference, and Canonicalization.” In: Proceedings of the 14th ACM SIGKDD Conference (KDD-2008).
- (Hall et al., 2008) ⇒ Rob Hall, Charles Sutton, and Andrew McCallum. (2008). “Unsupervised Deduplication Using Cross-field Dependencies.” In: Proceedings of SIGKDD Conference (KDD-2008).
- (Druck et al., 2008) ⇒ Gregory Druck, Gideon Mann, and Andrew McCallum. (2008). “Learning from Labeled Features Using Generalized Expectation Criteria.” In: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR 2008). doi:10.1145/1390334.1390436
2007
- (Culotta et al., 2007a) ⇒ Aron Culotta, Michael Wick, Robert Hall, and Andrew McCallum. (2007). “First-order probabilistic models for coreference resolution.” In: Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2007).
- (Culotta et al., 2007b) ⇒ Aron Culotta, Michael Wick, Robert Hall, Matthew Marzilli, and Andrew McCallum. (2007). “Canonicalization of Database Records using Adaptive Similarity Measures.” In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2007).
- (Sutton & McCallum, 2007) ⇒ Charles Sutton, and Andrew McCallum. (2007). “An Introduction to Conditional Random Fields for Relational Learning.” In: (Getoor & Taskar, 2007).
- (Mimno & McCallum, 2007) ⇒ David Mimno, and Andrew McCallum. (2007). “Expertise Modeling for Matching Papers with Reviewers.” In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. doi:10.1145/1281192.1281247
2006
- (Culotta et al., 2006) ⇒ Aron Culotta, Andrew McCallum, and Jonathan Betz. (2006). “Integrating Probabilistic Extraction Models and Data Mining to Discover Relations and Patterns in Text.” In: Proceedings of HLT-NAACL 2006.
- (Wick et al., 2006) ⇒ Michael Wick, Aron Culotta, and Andrew McCallum. (2006). “Learning Field Compatibilities to Extract Database Records from Unstructured Text.” In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2006).
- (McCallum, 2006) ⇒ Andrew McCallum. (2006). “Information Extraction, Data Mining and Joint Inference.” Invited Talk at KDD-2006.
- (Peng & McCallum, 2006) ⇒ Fuchun Peng, and Andrew McCallum. (2006). “Accurate Information Extraction from Research Papers using Conditional Random Fields.” In: Information Processing & Management, 42(4). doi:10.1016/j.ipm.2005.09.002
- (Wang & McCallum, 2006) ⇒ Xuerui Wang, and Andrew McCallum. (2006). “Topics Over Time: a non-Markov continuous-time model of topical trends.” In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2006). doi:10.1145/1150402.1150450
- (Mann et al., 2006) ⇒ Gideon S. Mann, David Mimno, and Andrew McCallum. (2006). “Bibliometric impact measures leveraging topic analysis.” In: Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries (JCDL 2006). doi:10.1145/1141753.1141765
2005
- (McCallum, 2005) ⇒ Andrew McCallum. (2005). “Information Extraction: Distilling Structured Data from Unstructured Text.” In: ACM Queue, 3(9).
- (McCallum & Wellner, 2005) ⇒ Andrew McCallum, and Ben Wellner. (2005). “Conditional Models of Identity Uncertainty with Application to Noun Coreference.” In: NIPS17.
- (McCallum, Bellare & Pereira, 2005) ⇒ Andrew McCallum, Kedar Bellare, and Fernando Pereira. (2005). “A conditional random field for discriminatively-trained finite-state string edit distance.” In: Proceedings of the Conference on Uncertainty in AI (UAI 2005).
- (Bekkerman & McCallum, 2005) ⇒ Ron Bekkerman, and Andrew McCallum. (2005). “Disambiguating Web Appearance of People in a Social Network.” In: Proceedings of the 14th International World Wide Web Conference. (WWW 2005). doi:10.1145/1060745.1060813
- (Culotta et al., 2005) ⇒ Aron Culotta, D. Kulp, and Andrew McCallum. (2005). “Gene Prediction with Conditional Random Fields.” University of Massachusetts, Amherst, Tech. Rep. UM-CS-2005-028.
- (Culotta & McCallum, 2005) ⇒ Aron Culotta, and Andrew McCallum. (2005). “Reducing Labeling Effort for Structured Prediction Tasks.” In: Proceedings of the 20th national conference on Artificial intelligence (AAAI 2005).
- (Sutton & McCallum, 2005b) ⇒ Charles Sutton, and Andrew McCallum. (2005). “Piecewise Training of Undirected Models.” In: 21st Conference on Uncertainty in Artificial Intelligence (UAI 2005).
- (Sutton & McCallum, 2005a) ⇒ Charles Sutton, and Andrew McCallum. (2005). “Joint Parsing and Semantic Role Labeling.” In: Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL 2005).
2004
- (McCallum et al., 2004) ⇒ Andrew McCallum, A. Corrada Emmanuel, and X. Wang. (2004). “The author-recipient-topic model for topic and role discovery in social networks.” Technical Report UM-CS-2004-096, Department of Computer Science, University of Massachusetts.
- (McCallum & Sutton, 2004) ⇒ Andrew McCallum, and Charles Sutton. (2004). “Piecewise Training with Parameter Independence Diagrams: Comparing Globally- and Locally-trained Linear-chain CRFs.” In: NIPS 2004 Workshop on Learning with Structured Outputs.
- (Culotta & McCallum, 2004) ⇒ Aron Culotta, and Andrew McCallum. (2004). “Confidence Estimation for Information Extraction.” In: Proceedings of HLT-NAACL (NAACL 2004).
- (Wellner et al., 2004) ⇒ Ben Wellner, Andrew McCallum, Fuchun Peng, and Michael Hay. (2004). “An Integrated, Conditional Model of Information Extraction and Coreference with Application to Citation Matching.” In: Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI 2004).
- (Peng & McCallum, 2004) ⇒ Fuchun Peng, and Andrew McCallum. (2004). “Accurate Information Extraction from Research Papers using Conditional Random Fields.” In: Proceedings of the Human Language Technology Conference and North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2004).
- (Culotta et al., 2004) ⇒ Aron Culotta, Ron Bekkerman, and Andrew McCallum. (2004). “Extracting Social Networks and Contact Information from Email and the Web.” In: Proceedings of the Conference on Email and Spam (CEAS 2004).
- (McCallum & Wellner, 2004) ⇒ Andrew McCallum, and Ben Wellner. (2004). “Conditional models of identity uncertainty with applications to noun coreference.” In: Neural Information Processing Systems.
- (Kristjansson et al., 2004) ⇒ Trausti Kristjansson, Aron Culotta, Paul Viola, and Andrew McCallum. (2004). “Interactive Information Extraction with Constrained Conditional Random Fields.” In: Proceedings of the 19th national conference on Artificial Intelligence (AAAI 2004).
2003
- (McCallum & Jensen, 2003) ⇒ Andrew McCallum, and David Jensen. (2003). “A Note on the Unification of Information Extraction and Data Mining using Conditional-Probability, Relational Models.” In: Proceedings of the IJCAI03 Workshop on Learning Statistical Models from Relational Data.
- (McCallum, 2003) ⇒ Andrew McCallum. (2003). “Efficiently Inducing Features of Conditional Random Fields.” In: Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence.
- (McCallum & Li, 2003) ⇒ Andrew McCallum and Wei Li. (2003). “Early Results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons.” In: Proceedings of Seventh Conference on Natural Language Learning (CoNLL), 2003.
- (Pinto et al., 2003) ⇒ David Pinto, Andrew McCallum, Xing Wei, and W. Bruce Croft. (2003). “Table Extraction Using Conditional Random Fields.” In: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR 2003). doi:10.1145/860435.860479
2002
- (McCallum, 2002) ⇒ Andrew McCallum. (2002). “MALLET: A Machine Learning for Language Toolkit.” http://mallet.cs.umass.edu.
2001
- (Lafferty et al., 2001) ⇒ John D. Lafferty, Andrew McCallum, and Fernando Pereira. (2001). “Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data.” In: Proceedings of ICML 2001.
2000
- (McCallum et al., 2000a) ⇒ Andrew McCallum, Dayne Freitag, and Fernando Pereira. (2000). “Maximum Entropy Markov Models for Information Extraction and Segmentation.” In: Proceedings of ICML 2000.
- (McCallum et al., 2000b) ⇒ Andrew McCallum, Kamal Nigam, and Lyle H. Ungar. (2000). “Efficient Clustering of High-dimensional Data Sets with Application to Reference Matching.” In: Proceedings of the sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000).
- (McCallum et al., 2000c) ⇒ Andrew McCallum, Kamal Nigam, Jason Rennie, and Kristie Seymore. (2000). “Automating the Construction of Internet Portals with Machine Learning.” In: Information Retrieval, 3(2). (doi:10.1023/A:1009953814988).
- (Nigam et al., 2000) ⇒ Kamal Nigam, Andrew McCallum, Sebastian Thrun, and Tom M. Mitchell. (2000). “Text Classification from Labeled and Unlabeled Documents Using EM.” In: Machine Learning, 39(2/3). doi:10.1023/A:1007692713085
- (Cohen et al., 2000) ⇒ William W. Cohen, Andrew McCallum, and D. Quass. (2000). “Learning to Understand the Web.” In: Bulletin of the IEEE Computer Society Technical Committee on Data Engineering.
1999
- (McCallum, 1999) ⇒ Andrew McCallum. (1999). “Multi-label Text Classification with a Mixture Model Trained by EM.” In: AAAI 99 Workshop on Text Learning.
- (Freitag & McCallum, 1999) ⇒ Dayne Freitag, and Andrew McCallum. (1999). “Information Extraction with HMMs and Shrinkage.” AAAI'99 Workshop on Machine Learning for Information Extraction.
- (Nigam et al., 1999) ⇒ Kamal Nigam, John Lafferty, and Andrew McCallum. (1999). “Using Maximum Entropy for Text Classification.” In: IJCAI-99 workshop on machine learning for information filtering.
1998
- (McCallum & Nigam, 1998) ⇒ Andrew McCallum, and Kamal Nigam. (1998). “A Comparison of Event Models for Naive Bayes Text Classification.” In: Proceedings of the AAAI/ICML-98 Workshop on Learning for Text Categorization.
- (Craven et al., 1998) ⇒ Mark Craven, Dan DiPasquo, Dayne Freitag, Andrew McCallum, Tom Mitchell, Kamal Nigam, and Sean Slattery. (1998). “Learning to Extract Symbolic Knowledge from the World Wide Web.” In: Proceedings of the 15th National Conference on Artificial Intelligence (AAAI 1998).
- (Baker & McCallum, 1998) ⇒ L. Douglas Baker, and Andrew McCallum. (1998). “Distributional Clustering of Words for Text Classification.” In: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval. ISBN:1-58113-015-5 doi:10.1145/290941.290970
1997
- (Craven et al., 1997) ⇒ Mark Craven, Dayne Freitag, Andrew McCallum, Tom M. Mitchell, Kamal Nigam, and C.Y. Quek. (1997). “Learning to Extract Symbolic Knowledge from the World Wide Web.” Technical report, Carnegie Mellon University.