Title :
Equi-Width Data Swapping for Private Data Publication
Author :
Li, Yidong ; Shen, Hong
Author_Institution :
Sch. of Comput. Sci., Univ. of Adelaide, Adelaide, SA, Australia
Abstract :
Data swapping is a popular value-invariant data perturbation technique. The quality of a data swapping method is measured by how well it preserves data privacy and data utility. Because swapping data globally is computationally impractical, appropriate localization schemes are often applied in advance to guarantee its performance on these metrics. Equi-depth partitioning is preferred by most existing data perturbation techniques because it provides uniform privacy protection for each data tuple. However, this method performs poorly in two types of applications: one is maintaining statistics based on equi-width partitioning, such as the multivariate histogram with equal bin width, and the other is preserving parametric statistics, such as covariance, in sparse data with a non-uniform distribution. As a natural solution for these applications, this paper explores the possibility of using data swapping with equi-width partitioning for private data publication, an approach that has seen little use in data perturbation because of the difficulty of preserving data privacy. With extensive theoretical analysis and experimental results, we show that Equi-Width Swapping (EWS) can achieve privacy preservation similar to that of Equi-Depth Swapping (EDS) if the number of partitions is sufficiently large (e.g., on the order of √N, where N is the size of the dataset). Our experimental results on both synthetic and real-world data validate our theoretical analysis.
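The contrast between the two partitioning schemes named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single numeric attribute, a uniformly random permutation within each partition as the swapping step, and k = ⌊√N⌋ partitions as an illustrative choice. The function names `equi_width_bins`, `equi_depth_bins`, and `swap_within_bins` are hypothetical.

```python
import math
import random


def equi_width_bins(values, k):
    """Assign each value to one of k equal-width intervals over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k if hi > lo else 1.0
    # The maximum value would fall just past the last interval, so clamp it.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equi_depth_bins(values, k):
    """Assign each value to one of k bins holding roughly equal record counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins


def swap_within_bins(values, bins, rng):
    """Randomly permute values among the records that share a bin (assumed swap rule)."""
    groups = {}
    for i, b in enumerate(bins):
        groups.setdefault(b, []).append(i)
    out = list(values)
    for idx in groups.values():
        shuffled = [values[i] for i in idx]
        rng.shuffle(shuffled)
        for i, v in zip(idx, shuffled):
            out[i] = v
    return out


rng = random.Random(0)
data = [rng.expovariate(1.0) for _ in range(400)]  # skewed toy data, N = 400
k = math.isqrt(len(data))                          # illustrative choice: floor(sqrt(N)) = 20

ews = swap_within_bins(data, equi_width_bins(data, k), rng)  # equi-width swapping
eds = swap_within_bins(data, equi_depth_bins(data, k), rng)  # equi-depth swapping
```

In this sketch both variants leave the multiset of published values unchanged, and the equi-width variant additionally keeps every record inside its original equal-width interval, so an equal-bin-width histogram of the swapped data matches that of the original. On skewed data the equi-width partitions hold very different numbers of records, which is the privacy difficulty the abstract refers to.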
Keywords :
data privacy; perturbation techniques; statistics; data utility; equi-depth partitioning; equi-width data swapping; privacy preservation; private data publication; value-invariant data perturbation; Application software; Australia; Computer science; Distributed computing; Histograms; Parametric statistics; Perturbation methods; Protection; Statistical distributions; Privacy preserving data mining; data publication; data swapping; equi-width partitioning
Conference_Title :
Parallel and Distributed Computing, Applications and Technologies, 2009 International Conference on
Conference_Location :
Higashi Hiroshima
Print_ISBN :
978-0-7695-3914-0
DOI :
10.1109/PDCAT.2009.69