DocumentCode
2963012
Title
Equi-Width Data Swapping for Private Data Publication
Author
Li, Yidong ; Shen, Hong
Author_Institution
Sch. of Comput. Sci., Univ. of Adelaide, Adelaide, SA, Australia
fYear
2009
fDate
8-11 Dec. 2009
Firstpage
231
Lastpage
238
Abstract
Data swapping is a popular value-invariant data perturbation technique. The quality of a data swapping method is measured by how well it preserves data privacy and data utility. Because swapping data globally is computationally impractical, appropriate localization schemes are often applied in advance to guarantee performance on these metrics. Equi-depth partitioning is preferred by most existing data perturbation techniques because it provides uniform privacy protection for each data tuple. However, this method is ineffective for two types of applications: maintaining statistics based on equi-width partitioning, such as a multivariate histogram with equal bin width, and preserving parametric statistics, such as covariance, for sparse data with a non-uniform distribution. As a natural solution for these applications, this paper explores data swapping with equi-width partitioning for private data publication, an approach that has seen little use in data perturbation because of the difficulty of preserving data privacy. Through extensive theoretical analysis and experiments, we show that Equi-Width Swapping (EWS) can achieve privacy preservation similar to that of Equi-Depth Swapping (EDS) if the number of partitions is sufficiently large (e.g., ¿ = ¿N, where N is the size of the dataset). Our experimental results on both synthetic and real-world data validate our theoretical analysis.
Keywords
data privacy; perturbation techniques; statistics; data privacy; data utility; equi-depth partitioning; equi-width data swapping; privacy preservation; private data publication; statistics; value-invariant data perturbation; Application software; Australia; Computer science; Data privacy; Distributed computing; Histograms; Parametric statistics; Perturbation methods; Protection; Statistical distributions; Privacy preserving data mining; data publication; data swapping; equi-width partitioning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Computing, Applications and Technologies, 2009 International Conference on
Conference_Location
Higashi Hiroshima
Print_ISBN
978-0-7695-3914-0
Type
conf
DOI
10.1109/PDCAT.2009.69
Filename
5372796