DocumentCode :
3267733
Title :
Privacy Protecting by Multiattribute Clustering in Data-Intensive Service
Author :
Zhu, Qing ; Li, Ning
Author_Institution :
Dept. of Comput. Sci., Renmin Univ. of China, Beijing, China
fYear :
2012
fDate :
25-27 June 2012
Firstpage :
1273
Lastpage :
1278
Abstract :
With the explosive growth of big data, organizations are strongly encouraged to release their microdata to support data-intensive analysis services, to create new business opportunities, and to enable scientific studies of every kind. However, releasing medical records about individuals violates their privacy; thus, privacy-preserving data publishing has become a critical issue for companies and organizations. Existing anonymization techniques for privacy protection mainly operate on quasi-identifier attributes and do not consider the relationships between different values of a sensitive attribute, which can reveal individuals' private information. This paper studies in detail the correlation between values of the sensitive attribute, builds on the idea of protecting the original data through a lossy join, and proposes the Twice-privacy algorithm, which is based on a utility matrix and multiattribute clustering. Twice-privacy clusters the sensitive values to protect against similarity attacks and assigns different weights to the quasi-identifier attributes so that they remain useful for query services; the data produced by the clustering algorithm are of high accuracy and high value. Experimental results on real datasets show the effectiveness and efficiency of the Twice-privacy algorithm. The proposed solution reduces the similarity attack rate to 0%. Meanwhile, the query correction rate and analysis correction rate of the proposed method improve noticeably, and query accuracy and analysis accuracy are also enhanced.
Keywords :
data privacy; electronic publishing; medical information systems; pattern clustering; query processing; analysis correction rate; business opportunities; data-intensive service; individual privacy information; medical records; multiattribute clustering; privacy protection; privacy protection anonymous technique; privacy-preserving data publishing; quasiidentifier attributes; query correction rate; query service; similarity protection; twice-privacy algorithm; utility matrix; Accuracy; Algorithm design and analysis; Clustering algorithms; Data privacy; Diseases; Privacy; Vectors; Clustering algorithm; Data-intensive computing; Privacy preserving;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Trust, Security and Privacy in Computing and Communications (TrustCom), 2012 IEEE 11th International Conference on
Conference_Location :
Liverpool
Print_ISBN :
978-1-4673-2172-3
Type :
conf
DOI :
10.1109/TrustCom.2012.224
Filename :
6296125