Title :
A privacy attack that removes the majority of the noise from perturbed data
Author_Institution :
Dept. of Comput. Eng. & Math., Rovira i Virgili Univ., Tarragona, Spain
Abstract :
Data perturbation is a sanitization method that helps restrict the disclosure of sensitive information from published data. We present an attack on the privacy of published data that has been sanitized using data perturbation. The attack employs data mining to remove some of the noise from the perturbed sensitive values. Our attack is practical, can be launched by non-expert adversaries, and requires no background knowledge. Extensive experiments were performed on four databases derived from UCI's Adult and IPUMS census-based data sets, sanitized with noise addition that satisfies ε-differential privacy. The experimental results confirm that our attack poses a significant privacy risk to published perturbed data: up to 93% of the noise added during perturbation can be effectively removed using general-purpose data miners from the Weka software package. Interestingly, the higher the targeted privacy level, the higher the percentage of noise that can be removed. This suggests that adding more noise does not always increase the real privacy.
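The core idea described in the abstract (the paper's actual procedure uses Weka data miners on census data) can be illustrated with a minimal, hypothetical sketch: a sensitive attribute is perturbed with Laplace noise calibrated in the ε-differential-privacy style, and an adversary fits a simple k-nearest-neighbors regressor on a correlated public attribute to smooth the perturbed values and cancel part of the noise. All data, parameters, and the k-NN denoiser here are illustrative assumptions, not the authors' exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a public quasi-identifier x correlated with a
# sensitive numeric value y (both invented for illustration).
n = 2000
x = rng.uniform(0, 100, n)
y = 2.0 * x + rng.normal(0, 5, n)  # true sensitive values

# Sanitize y with Laplace noise, scale = sensitivity / epsilon
# (the standard Laplace-mechanism calibration for eps-DP noise addition).
sensitivity = y.max() - y.min()
eps = 1.0
y_pert = y + rng.laplace(0.0, sensitivity / eps, n)

# Attack sketch: predict each record's sensitive value from the
# perturbed values of its k nearest neighbors in x; averaging
# independent Laplace noise over k records cancels much of it.
def knn_predict(xq, xs, ys, k=25):
    idx = np.argsort(np.abs(xs - xq))[:k]
    return ys[idx].mean()

y_hat = np.array([knn_predict(xi, x, y_pert) for xi in x])

noise_before = np.abs(y_pert - y).mean()
noise_after = np.abs(y_hat - y).mean()
removed = 100.0 * (1.0 - noise_after / noise_before)
print(f"mean |noise| before: {noise_before:.1f}, after: {noise_after:.1f}")
print(f"approx. noise removed: {removed:.0f}%")
```

Note how the sketch also reflects the abstract's counterintuitive finding: increasing ε⁻¹ (more noise, nominally more privacy) enlarges the Laplace scale, but the averaging attack still strips a large fraction of it, so the residual error grows far less than the added noise.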
Keywords :
data mining; data privacy; security of data; software packages; ε-differential privacy; Weka software package; data perturbation; privacy attack; published data privacy; sanitization method; databases; estimation; noise; prediction algorithms
Conference_Titel :
The 2010 International Joint Conference on Neural Networks (IJCNN)
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-6916-1
DOI :
10.1109/IJCNN.2010.5596527