2008 RecordLinkageSoftwareInThePublicDomain
- (Campbell et al., 2008) ⇒ Kevin M. Campbell, Dennis Deck, Antoinette Krupski. (2008). “Record linkage software in the public domain: a comparison of Link Plus, The Link King, and a ‘basic’ deterministic algorithm.” In: Health Informatics Journal (1).
Subject Headings: RP Link Plus System, The Link King System, Deterministic Duplicate Record Detection Algorithm, Probabilistic Duplicate Record Detection Algorithm, Person Record Deduplication Task.
Notes
- Reports an Empirical Test of the Performance of three Person Record Linkage Systems: RP Link Plus System, The Link King System, and a Deterministic Record Linkage Algorithm.
- The Benchmark Dataset is the client Person Record Set of Washington State’s Division of Alcohol and Substance Abuse (DASA).
- DASA’s client database contains over 600,000 Person Records each with a “ClientID”.
- Approximately 26 per cent of Person Records are Duplicate Records.
- Person Record Data Attributes include: Person Last Name, Person First Name, Person Middle Name, Birth Year, Birth Month, Birth Day, SSN, Person Sex, and Person Race.
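As a rough illustration of the deduplication task described in these notes, the sketch below groups Person Records that agree exactly on a set of identifiers. It is not the paper's ‘basic’ deterministic algorithm, whose matching rules are not given on this page. The field names (`client_id`, `ssn`, `last_name`, and so on), the choice of key fields, and the sample records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical key fields, loosely based on the attributes listed above.
KEY_FIELDS = ("ssn", "last_name", "first_name",
              "birth_year", "birth_month", "birth_day")


def match_key(record):
    """Return a normalized exact-match key, or None if any key field is missing."""
    values = []
    for field in KEY_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            return None  # incomplete records are left unlinked in this sketch
        values.append(str(value).strip().upper())
    return tuple(values)


def deterministic_dedupe(records):
    """Group client IDs whose records share the same key; return groups of 2 or more."""
    groups = defaultdict(list)
    for record in records:
        key = match_key(record)
        if key is not None:
            groups[key].append(record["client_id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up sample records for illustration only.
records = [
    {"client_id": "A1", "ssn": "123456789", "last_name": "Smith",
     "first_name": "Ann", "birth_year": 1970, "birth_month": 5, "birth_day": 2},
    {"client_id": "A2", "ssn": "123456789", "last_name": " smith",
     "first_name": "ANN", "birth_year": 1970, "birth_month": 5, "birth_day": 2},
    {"client_id": "A3", "ssn": "987654321", "last_name": "Jones",
     "first_name": "Bob", "birth_year": 1982, "birth_month": 11, "birth_day": 30},
    {"client_id": "A4", "ssn": "", "last_name": "Smith",
     "first_name": "Ann", "birth_year": 1970, "birth_month": 5, "birth_day": 2},
]

duplicates = deterministic_dedupe(records)
print(duplicates)
```

Exact matching of this kind misses pairs that differ by a typo or a missing identifier (record `A4` here), which is one plausible reason a deterministic approach can trade sensitivity for a high positive predictive value.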
Quotes
Abstract
- The study objective was to compare the accuracy of a deterministic record linkage algorithm and two public domain software applications for record linkage (The Link King and Link Plus). The three algorithms were used to unduplicate an administrative database containing personal identifiers for over 500,000 clients. Subsequently, a random sample of linked records was submitted to four research staff for blinded clerical review. Using reviewers' decisions as the ‘gold standard’, sensitivity and positive predictive values (PPVs) were estimated. Optimally, sensitivity and PPVs in the mid 90s could be obtained from both The Link King and Link Plus. Sensitivity and PPVs using a basic deterministic algorithm were 79 and 98 per cent respectively. Thus the full feature set of The Link King makes it an attractive option for SAS users. Link Plus is a good choice for non-SAS users as long as necessary programming resources are available for processing record pairs identified by Link Plus.
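The two accuracy measures named in the abstract can be computed from counts of record pairs judged against the clerical-review ‘gold standard’. The sketch below uses the standard definitions: sensitivity = TP / (TP + FN) and PPV = TP / (TP + FP). The counts are invented for illustration and are not taken from the paper.

```python
def sensitivity(true_pos, false_neg):
    """Share of true matches (per the gold standard) that the algorithm linked."""
    total = true_pos + false_neg
    return true_pos / total if total else float("nan")


def positive_predictive_value(true_pos, false_pos):
    """Share of the algorithm's links that the gold standard confirms as matches."""
    total = true_pos + false_pos
    return true_pos / total if total else float("nan")


# Hypothetical pair counts, for illustration only.
true_pos = 90   # linked by the algorithm and confirmed by reviewers
false_pos = 10  # linked by the algorithm but rejected by reviewers
false_neg = 30  # true matches the algorithm failed to link

sens = sensitivity(true_pos, false_neg)
ppv = positive_predictive_value(true_pos, false_pos)
print(f"sensitivity = {sens:.2f}, PPV = {ppv:.2f}")
```

With these invented counts the sensitivity is 0.75 and the PPV is 0.90. An algorithm can score high on PPV while scoring lower on sensitivity if it links only the safest pairs.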