Title :
On Data Distortion for Privacy Preserving Data Mining
Author :
Kabir, Saif M A ; Youssef, Amr M. ; Elhakeem, Ahmed K.
Author_Institution :
Concordia Univ., Montreal
Abstract :
Because of the increasing ability to trace and collect large amounts of personal information, privacy preservation in data mining applications has become an important concern. Data perturbation is one of the well-known techniques for privacy preserving data mining. The objective of data perturbation techniques is to distort the individual data values while preserving the underlying statistical distribution properties. These techniques are usually assessed in terms of both their privacy parameters and their associated utility measures. While the privacy parameters capture the ability of these techniques to hide the original data values, the data utility measures assess whether the dataset retains the performance of data mining techniques after the data distortion. In this paper, we investigate the use of truncated non-negative matrix factorization (NMF) with sparseness constraints for data perturbation.
Keywords :
data mining; data privacy; matrix decomposition; security of data; sparse matrices; statistical distributions; data distortion; data perturbation; data utility measure; personal information; privacy preserving data mining; sparseness constraints; statistical distribution; truncated nonnegative matrix factorization; Application software; Data engineering; Data mining; Data privacy; Distortion measurement; Euclidean distance; Information systems; Perturbation methods; Statistical distributions; Systems engineering and theory;
Conference_Title :
Electrical and Computer Engineering, 2007. CCECE 2007. Canadian Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
1-4244-1020-7
Electronic_ISBN :
0840-7789
DOI :
10.1109/CCECE.2007.83