DocumentCode
2741348
Title
Privacy Preserving Clustering by Cluster Bulging for Information Sustenance
Author
Kadampur, Mohammad Ali ; Somayajulu, D.V.L.N. ; Dhiraj, S. S Shivaji ; Satyam, Shailesh G P
Author_Institution
Dept. of Comput. Sci. & Eng., Nat. Inst. of Technol., Warangal
fYear
2008
fDate
12-14 Dec. 2008
Firstpage
240
Lastpage
246
Abstract
Cluster analysis is a data mining approach for unsupervised learning. However, the use of clustering as a data mining tool is a growing concern because it can violate individual privacy. This paper presents a method for privacy-preserving clustering through cluster bulging. In this method, the objects of the database are first grouped into clusters based on a similarity measure. The data in these clusters are then perturbed in a controlled manner by modifying object values, so that the clusters in the perturbed data set are bulged relative to those in the original data set. To perform this perturbation, every cluster is displaced along the line joining its centroid to the centroid of the whole data set, and then every object in each cluster is shifted along the line joining that object to the centroid of its cluster. The term bulging here covers both positive and negative bulging. In essence, the method manipulates the similarity measures and recomputes the perturbed objects of the respective clusters, so that every object in a bulged cluster represents its corresponding object in the original cluster. After the method is applied, the objects are perturbed, while the number of member objects and the shape of each cluster remain the same as in the original clusters. The information in the two data sets is thereby sustained while the privacy of sensitive data is preserved.
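The two-step perturbation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything concrete here is an assumption: the function names `centroid` and `bulge`, the multiplicative factors `cluster_factor` and `object_factor`, and the use of precomputed cluster labels are hypothetical choices, not the authors' implementation. A factor greater than 1 would correspond to positive bulging and a factor between 0 and 1 to negative bulging.

```python
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def bulge(points, labels, cluster_factor=1.5, object_factor=1.2):
    """Perturb clustered points in two steps (illustrative assumption only).

    Step 1: move each cluster centroid along the line joining it to the
            centroid of the whole data set, scaled by cluster_factor.
    Step 2: move each object along the line joining it to its cluster
            centroid, scaled by object_factor.
    Both factors are hypothetical parameters, not taken from the paper.
    """
    global_c = centroid(points)

    # Group the objects by their (already assigned) cluster label.
    members = defaultdict(list)
    for p, lab in zip(points, labels):
        members[lab].append(p)
    cluster_c = {lab: centroid(ps) for lab, ps in members.items()}

    perturbed = []
    for p, lab in zip(points, labels):
        c = cluster_c[lab]
        # Step 1: displaced centroid of this object's cluster.
        new_c = tuple(g + cluster_factor * (ci - g)
                      for g, ci in zip(global_c, c))
        # Step 2: object offset from its centroid, scaled and re-attached.
        perturbed.append(tuple(nc + object_factor * (pi - ci)
                               for nc, pi, ci in zip(new_c, p, c)))
    return perturbed
```

Under these assumptions every object maps to exactly one perturbed object, so cluster membership counts are unchanged, and uniform scaling about the centroid keeps each cluster's shape up to size.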
Keywords
data mining; data privacy; database management systems; unsupervised learning; cluster analysis; cluster bulging; database objects; information sustenance; perturbation; perturbed objects; privacy preserving clustering; sensitive data; similarity measure; Algorithm design and analysis; Clustering algorithms; Computer science; Data analysis; Databases; Information analysis; Shape; data perturbation; information revealing; privacy preservation
fLanguage
English
Publisher
IEEE
Conference_Titel
Information and Automation for Sustainability, 2008. ICIAFS 2008. 4th International Conference on
Conference_Location
Colombo
Print_ISBN
978-1-4244-2899-1
Electronic_ISBN
978-1-4244-2900-4
Type
conf
DOI
10.1109/ICIAFS.2008.4783947
Filename
4783947