DocumentCode :
1622264
Title :
Generation of synthetic data by means of fuzzy c-Regression
Author :
Cano, Isaac ; Torra, Vicenc
Author_Institution :
IIIA (Artificial Intell. Res. Inst.), Spanish Nat. Res. Council, Bellaterra, Spain
fYear :
2009
Firstpage :
1145
Lastpage :
1150
Abstract :
Problems related to data privacy are studied in the areas of privacy-preserving data mining (PPDM) and statistical disclosure control (SDC). Their goal is to avoid the disclosure of sensitive or proprietary information to third parties. In this paper, a new synthetic data generation method is proposed, and its information loss and disclosure risk are measured. The method is based on fuzzy techniques. Informally, a fuzzy c-regression method is applied to the original data set, and synthetic data is released with an appropriate level of information loss and disclosure risk, depending on c. Like other data protection methods, our synthetic data generation procedure allows third parties to perform some statistical computations with a limited risk of disclosure. The trade-off between the data utility and data safety of the proposed method is assessed.
Keywords :
data mining; data privacy; fuzzy set theory; pattern clustering; regression analysis; risk analysis; security of data; data privacy; data protection; data safety; data utility; disclosure risk; fuzzy c-regression; information loss; privacy preserving data mining; proprietary information; sensitive information; statistical disclosure control; synthetic data generation; Couplings; Data models; Data privacy; Fuzzy control; Fuzzy sets; Loss measurement; Pervasive computing; Protection; Safety; Statistics;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems, 2009. FUZZ-IEEE 2009. IEEE International Conference on
Conference_Location :
Jeju Island
ISSN :
1098-7584
Print_ISBN :
978-1-4244-3596-8
Electronic_ISBN :
1098-7584
Type :
conf
DOI :
10.1109/FUZZY.2009.5277074
Filename :
5277074