DocumentCode
1432563
Title
Anonymization of Longitudinal Electronic Medical Records
Author
Tamersoy, Acar ; Loukides, Grigorios ; Nergiz, Mehmet Ercan ; Saygin, Yucel ; Malin, Bradley
Author_Institution
Dept. of Biomed. Inf., Vanderbilt Univ., Nashville, TN, USA
Volume
16
Issue
3
fYear
2012
fDate
5/1/2012
Firstpage
413
Lastpage
423
Abstract
Electronic medical record (EMR) systems have enabled healthcare providers to collect detailed patient information from the primary care domain. At the same time, longitudinal data from EMRs are increasingly combined with biorepositories to generate personalized clinical decision support protocols. Emerging policies encourage investigators to disseminate such data in a deidentified form for reuse and collaboration, but organizations are hesitant to do so because they fear such actions will jeopardize patient privacy. In particular, there are concerns that residual demographic and clinical features could be exploited for reidentification purposes. Various approaches have been developed to anonymize clinical data, but they neglect temporal information and are, thus, insufficient for emerging biomedical research paradigms. This paper proposes a novel approach to share patient-specific longitudinal data that offers robust privacy guarantees, while preserving data utility for many biomedical investigations. Our approach aggregates temporal and diagnostic information using heuristics inspired by sequence alignment and clustering methods. Using several patient cohorts derived from the EMR system of the Vanderbilt University Medical Center, we demonstrate that the proposed approach can generate anonymized data that permit effective biomedical analysis.
Keywords
data privacy; decision support systems; medical information systems; protocols; statistical analysis; anonymization; biomedical analysis; biomedical research; biorepositories; clinical features; clustering method; data utility; heuristics; longitudinal electronic medical records; patient information; patient privacy; patient-specific longitudinal data; personalized clinical decision support protocols; primary care domain; reidentification; residual demographic features; robust privacy guarantees; sequence alignment; temporal information; Bioinformatics; DNA; Data privacy; Educational institutions; Measurement; Medical diagnostic imaging; Trajectory; Anonymization; data privacy; electronic medical records (EMRs); longitudinal data; Algorithms; Cluster Analysis; Cohort Studies; Confidentiality; Database Management Systems; Electronic Health Records; Humans;
fLanguage
English
Journal_Title
Information Technology in Biomedicine, IEEE Transactions on
Publisher
IEEE
ISSN
1089-7771
Type
jour
DOI
10.1109/TITB.2012.2185850
Filename
6140575