A K-anonymity clustering algorithm based on the information entropy

Author

Jianpei Zhang ; Ying Zhao ; Yue Yang ; Jing Yang

Author_Institution

Coll. of Comput. Sci. & Technol., Harbin Eng. Univ., Harbin, China

fYear

2014

fDate

21-23 May 2014

Firstpage

319

Lastpage

324

Abstract

Data anonymization techniques are the main way to achieve privacy protection, and K-anonymity, a classical anonymity model, is the most effective and most frequently used. However, most K-anonymity algorithms struggle to balance data quality against efficiency, and they neglect data privacy in order to improve data quality. To solve these problems, we propose a K-anonymity clustering algorithm based on information entropy, which introduces the concept of “diameter” and a new clustering criterion based on the maximum threshold of equivalence classes. Experimental results show that both algorithm efficiency and data security are improved while the total information loss remains acceptable, so the proposed algorithm has some practical value in applications.
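
For readers unfamiliar with the two notions the abstract relies on, the following is a minimal sketch of the standard definitions only: a table is K-anonymous when every equivalence class (records sharing the same quasi-identifier values) has at least K members, and the information entropy of a class is the Shannon entropy of its attribute values. It is not the proposed algorithm, since the abstract does not define the “diameter” or the clustering criterion. The function names, the sample records, and the choice of `age` and `zip` as quasi-identifiers are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        classes[key].append(rec)
    return dict(classes)


def is_k_anonymous(records, quasi_identifiers, k):
    """Standard K-anonymity check: every equivalence class has >= k records."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in classes.values())


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical, already-generalized sample data (not taken from the paper).
records = [
    {"age": "20-29", "zip": "150**", "disease": "flu"},
    {"age": "20-29", "zip": "150**", "disease": "cold"},
    {"age": "30-39", "zip": "151**", "disease": "flu"},
    {"age": "30-39", "zip": "151**", "disease": "flu"},
]

if __name__ == "__main__":
    print(is_k_anonymous(records, ["age", "zip"], 2))
    for key, group in equivalence_classes(records, ["age", "zip"]).items():
        print(key, shannon_entropy([r["disease"] for r in group]))
```

In this sample, both equivalence classes contain two records, so the table satisfies 2-anonymity but not 3-anonymity. The second class has zero entropy on `disease`, which illustrates why a K-anonymous table can still leak sensitive values.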

Keywords

data privacy; entropy; pattern clustering; security of data; K-anonymity clustering algorithm; classical anonymity model; data anonymization techniques; data efficiency; data quality improvement; data security; information entropy; maximum equivalence class threshold; privacy protection; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Data security; Entropy; Information entropy; Loss measurement; K-anonymity; clustering; information entropy; privacy preserving

fLanguage

English

Publisher

IEEE

Conference_Titel

Computer Supported Cooperative Work in Design (CSCWD), Proceedings of the 2014 IEEE 18th International Conference on

Conference_Location

Hsinchu

Type

conf

DOI

10.1109/CSCWD.2014.6846862

Filename

6846862