DocumentCode :
3706619
Title :
An HL7 Data Pseudonymization Pipeline
Author :
Thusitha Mabotuwana;Christopher S. Hall;Rob van Ommering;Ranjith Tellis;Merlijn Sevenster
Author_Institution :
Philips Healthcare, New York, NY, USA
Year :
2015
Firstpage :
303
Lastpage :
309
Abstract :
The increasing uptake of information technology in the healthcare domain has resulted in a large volume of digital health data being generated on a regular basis. Most health information systems exchange information using HL7 messages, making HL7 a good source for data collection. Despite the large volume of data generated within treatment facilities, sharing this data for research and development purposes is not straightforward due to patient privacy protection legislation. In this paper we present an HL7 data processing pipeline that extracts information from complex HL7 data structures, de-identifies fields containing identifiable patient information, and reconstructs HL7 messages using the de-identified data, while providing a secure mechanism for re-identification. Using a dataset of 149,647 production HL7 messages, we demonstrate that the system achieves over 99% accuracy, suggesting that this is a feasible approach to de-identifying patient data.
Keywords :
"Data mining","Hospitals","Pipelines","Data processing","Production","Standards"
Publisher :
IEEE
Conference_Title :
Healthcare Informatics (ICHI), 2015 International Conference on
Type :
conf
DOI :
10.1109/ICHI.2015.43
Filename :
7349704