DocumentCode
3706619
Title
An HL7 Data Pseudonymization Pipeline
Author
Thusitha Mabotuwana;Christopher S. Hall;Rob van Ommering;Ranjith Tellis;Merlijn Sevenster
Author_Institution
Philips Healthcare, New York, NY, USA
fYear
2015
Firstpage
303
Lastpage
309
Abstract
The increasing uptake of information technology in the healthcare domain has resulted in a large volume of digital health data being generated on a regular basis. Most health information systems exchange information using HL7 messages, making HL7 a good source for data collection. Despite the large volume of data generated within treatment facilities, sharing this data for research and development purposes is not straightforward because of patient privacy protection legislation. In this paper, we present an HL7 data processing pipeline that extracts information from complex HL7 data structures, de-identifies fields containing identifiable patient information, and reconstructs HL7 messages from the de-identified data, while providing a secure mechanism for re-identification. Using a dataset of 149,647 production HL7 messages, we demonstrate that the system achieves over 99% accuracy, suggesting that this is a feasible approach to de-identifying patient data.
Keywords
"Data mining","Hospitals","Pipelines","Data processing","Production","Standards"
Publisher
ieee
Conference_Titel
Healthcare Informatics (ICHI), 2015 International Conference on
Type
conf
DOI
10.1109/ICHI.2015.43
Filename
7349704