Record Linkage Algorithm
A Record Linkage Algorithm is a coreference resolution algorithm that can solve a coreferent record resolution task.
- AKA: Coreferent Record Detection Algorithm, Coreference Record Resolution Algorithm.
- Context:
- It can range from being a Heuristic Coreferent Record Detection Algorithm to being a Data-Driven Coreferent Record Detection Algorithm (such as a supervised record coreference resolution algorithm).
- It can range from being a Domain-Independent Coreference Resolution Algorithm to being a Domain-Specific Coreference Resolution Algorithm (such as a person record duplicate detection algorithm).
- It can be applied by a Coreferent Record Resolution System (that can solve a coreferent record resolution task).
- It can range from being an Independent-Record Linkage Algorithm to being a Relational-Record Linkage Algorithm.
- It can range from being an Intra-Database Record Resolution Algorithm to being an Inter-Database Record Resolution Algorithm.
- Example(s):
- Counter-Example(s):
- See: Taxonomy Record Linkage Algorithm, Duplicate Record Detection Algorithm.
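The heuristic, intra-database end of the ranges listed under Context can be illustrated with a minimal sketch. It is not taken from any of the referenced works: the field names, the unweighted averaging of per-field string similarities, and the 0.85 threshold are all assumptions chosen for illustration, and the pairwise comparison is quadratic in the number of records.

```python
from difflib import SequenceMatcher
from itertools import combinations


def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two field values after light normalization."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def record_similarity(r1: dict, r2: dict, fields: list) -> float:
    """Unweighted mean of per-field similarities (an illustrative heuristic)."""
    return sum(field_similarity(r1[f], r2[f]) for f in fields) / len(fields)


def link_records(records: list, fields: list, threshold: float = 0.85) -> list:
    """Return index pairs of records judged coreferent within one database."""
    return [
        (i, j)
        for (i, r1), (j, r2) in combinations(enumerate(records), 2)
        if record_similarity(r1, r2, fields) >= threshold
    ]


# Hypothetical person records; the names and cities are made up.
records = [
    {"name": "Jonathan Smith", "city": "Boston"},
    {"name": "Jonathon Smith", "city": "Boston"},
    {"name": "Maria Garcia", "city": "Austin"},
]
pairs = link_records(records, ["name", "city"])
```

With these records, only the first two are paired. A data-driven variant would learn the field weights or the threshold from labeled pairs instead of fixing them by hand, and an inter-database variant would compare records across two collections rather than within one.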
References
2007
- (Bhattacharya & Getoor, 2007) ⇒ Indrajit Bhattacharya, and Lise Getoor. (2007). “Collective Entity Resolution in Relational Data.” In: ACM Transactions on Knowledge Discovery from Data (TKDD 2007).
- (Elmagarmid et al., 2007) ⇒ Ahmed K. Elmagarmid, Panagiotis G. Ipeirotis, and Vassilios S. Verykios. (2007). “Duplicate Record Detection: A Survey.” In: IEEE Transactions on Knowledge and Data Engineering, 19(1).
2006
- (Bhattacharya & Getoor, 2006) ⇒ Indrajit Bhattacharya, and Lise Getoor. (2006). “A Latent Dirichlet Model for Unsupervised Entity Resolution.” In: Proceedings of the Sixth SIAM International Conference on Data Mining (SIAM 2006).
- Lise Getoor. (2006). “Entity Resolution in Relational Data.” Presentation at Second International Workshop on Exchange and Integration of Data.
2005
- (Bilenko et al., 2005) ⇒ Mikhail Bilenko, Sugato Basu, and Mehran Sahami. (2005). “Adaptive Product Normalization: Using Online Learning for Record Linkage in Comparison Shopping.” In: Proceedings of the 5th IEEE International Conference on Data Mining (ICDM-2005).
- Campbell K, Deck D, Cox C, Broderick C. (2005). The Link King User Manual (online). 3 November 2006.
- (Kalashnikov et al., 2005) ⇒ Dmitri V. Kalashnikov, Sharad Mehrotra, and Zhaoqi Chen. (2005). “Exploiting Relationships for Domain-Independent Data Cleaning.” In: Proceedings of the SIAM International Conference on Data Mining (SIAM SDM 2005).
- (Dong et al., 2005) ⇒ X Dong, A Halevy, and J Madhavan. (2005). “Reference Reconciliation in Complex Information Spaces.” In: Proceedings of the ACM SIGMOD Conference (SIGMOD 2005).
- (Chaudhuri et al., 2005) ⇒ Surajit Chaudhuri, Venkatesh Ganti, and Rajeev Motwani. (2005). “Robust Identification of Fuzzy Duplicates.” In: Proceedings of the 21st International Conference on Data Engineering (ICDE 2005).
2003
- (Bilenko & Mooney, 2003) ⇒ Mikhail Bilenko, and Raymond Mooney. (2003). “Adaptive Duplicate Detection Using Learnable String Similarity Measures.” In: Proceedings of the ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003).
- (Jin et al., 2003) ⇒ Liang Jin, Chen Li, and Sharad Mehrotra. (2003). “Efficient Record Linkage in Large Data Sets.” In: Proceedings of the Eighth International Conference on Database Systems for Advanced Applications.
- (Weiner et al., 2003) ⇒ Weiner M, Stump T, Callahan C, Lewis J, McDonald C. (2003). “A Practical Method of Linking Data from Medicare Claims and a Comprehensive Electronic Medical Records System.” In: International Journal of Medical Informatics, 71(1), 57–69.
2002
- Grannis S, Overhage J, and McDonald C. (2002). “Analysis of identifier performance using a deterministic linkage algorithm.” In: Proceedings of the American Medical Informatics Association Symposium. Philadelphia: Hanley and Belfus.
- Gomatam S, Carter R, Ariet M, Mitchell G. (2002). “An Empirical Comparison of Record Linkage Procedures.” In: Statistics in Medicine, 21, 1485–96.
2000
- (McCallum et al., 2000b) ⇒ Andrew McCallum, Kamal Nigam, and Lyle H. Ungar. (2000). “Efficient Clustering of High-dimensional Data Sets with Application to Reference Matching.” In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2000).
- NOTES: It proposes an Unsupervised Algorithm.
1999
- Gomatam S, Carter R. (1999). “A Computerized Stepwise Deterministic Strategy for Record Linkage.” University of Florida, Technical Report 615.