DocumentCode :
1787136
Title :
A Benchmark of Globally-Optimal Anonymization Methods for Biomedical Data
Author :
Prasser, Fabian ; Kohlmayer, Florian ; Kuhn, Klaus A.
Author_Institution :
Biomed. Inf. Dept. of Med., Tech. Univ. München, Munich, Germany
fYear :
2014
fDate :
27-29 May 2014
Firstpage :
66
Lastpage :
71
Abstract :
Collaboration and data sharing have become core elements of biomedical research. At the same time, there is a growing understanding of privacy threats related to data sharing, especially when sensitive data from distributed sources become available for linkage. Statistical disclosure control comprises well-known data anonymization techniques that allow the protection of data by introducing fuzziness. To protect datasets from different types of threats, different privacy criteria are commonly implemented. Data anonymization is an important measure, but it is computationally complex, and it can significantly reduce the expressiveness of data. To attenuate these problems, a number of algorithms have been proposed that aim at increasing data quality or improving efficiency. Previous evaluations of such algorithms lack a systematic approach, as they focus on specific algorithms, specific privacy criteria, and specific runtime environments. Therefore, it is difficult for decision makers to decide which algorithm is best suited for their requirements. As a first step towards a comprehensive and systematic evaluation of anonymization algorithms, we report on our ongoing efforts to provide an open source benchmark. In this contribution, we focus on optimal algorithms utilizing global recoding with full-domain generalization. We present a systematic evaluation of domain-specific algorithms and generic search methods for a broad set of privacy criteria, including k-anonymity, l-diversity, t-closeness and δ-presence, and their use in multiple real-world datasets. Our results show that there is no single solution fitting all needs, and that generic search methods can outperform highly specialized algorithms.
Keywords :
data privacy; medical information systems; δ-presence; l-diversity; biomedical data; data anonymization; data quality; domain-specific algorithms; full-domain generalization; generic search methods; global recoding; globally-optimal anonymization methods; k-anonymity; privacy criteria; t-closeness; Benchmark testing; Biomedical measurement; Data privacy; Lattices; Prediction algorithms; Privacy; Tagging; δ-presence; benchmark; de-identification; k-anonymity; l-diversity; privacy; statistical disclosure control; t-closeness;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer-Based Medical Systems (CBMS), 2014 IEEE 27th International Symposium on
Conference_Location :
New York, NY
Type :
conf
DOI :
10.1109/CBMS.2014.85
Filename :
6881850