Title :
Establishing a benchmark for re-identification methods and its validation using fuzzy clustering
Author :
Torra, Vicenç ; Domingo-Ferrer, Josep
Author_Institution :
IIIA-CSIC, Bellaterra
Abstract :
Privacy-preserving data mining and statistical disclosure control are related fields of increasing importance. Their aim is to allow the publication of sensitive data without compromising the privacy of data respondents. To that end, masking methods have been designed so that data are distorted in a way that preserves both confidentiality and data utility. Alternatively, methods have been constructed to generate synthetic data with properties similar to those of the original data. At the same time, research on re-identification methods (record and variable matching) has been pushed forward by the current interest in security issues and the huge amount of data stored in databases. However, there is no standard methodology for comparing alternative re-identification methods. In this paper we propose the use of masking methods and synthetic data generators for building benchmarks for matching methods. We validate our approach using fuzzy clustering.
Keywords :
data mining; data privacy; fuzzy set theory; pattern clustering; pattern matching; data mining; data privacy; data storage; fuzzy clustering; masking method; re-identification method; record matching; security issue; statistical disclosure control; synthetic data generator; Couplings; Data mining; Data privacy; Data security; Databases; Design methodology; Digital signal processing; Explosions; Fuzzy control; Stress;
Conference_Title :
Fuzzy Systems, 2006 IEEE International Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
0-7803-9488-7
DOI :
10.1109/FUZZY.2006.1681816