Title :
Evaluation of information loss for privacy preserving data mining through comparison of fuzzy partitions
Author :
Cano, Isaac ; Ladra, Susana ; Torra, Vicenç
Author_Institution :
Artificial Intell. Res. Inst., Spanish Nat. Res. Council, Bellaterra, Spain
Abstract :
In this paper, we focus on the problem of preserving data confidentiality when sharing data for clustering. This problem poses new challenges for novel uses of privacy preserving data mining (PPDM) techniques. Specifically, this paper considers synthetic data generation as a way to preserve data privacy. Among the state-of-the-art synthetic data generators is the IPSO family of methods. It has been stated that using IPSO to generate synthetic data is appropriate when the user plans to apply clustering to the data. This paper aims to establish the same property for the FCRM synthetic data generator and, at the same time, to assess the relationship between the information loss produced when generating synthetic data with FCRM and the clustering similarity between the original and synthetic data.
Keywords :
data mining; data privacy; fuzzy set theory; pattern clustering; PPDM; fuzzy partitions; information loss evaluation; privacy preserving data mining; synthetic data generation; Clustering algorithms; Data models; Data privacy; Generators; Indexes; Loss measurement
Conference_Title :
Fuzzy Systems (FUZZ), 2010 IEEE International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-6919-2
DOI :
10.1109/FUZZY.2010.5584186