DocumentCode :
2877621
Title :
An Indexed Bottom-up Approach for Publishing Anonymized Data
Author :
Anh-Tu Hoang ; Minh-Triet Tran ; Anh-Duc Duong ; Echizen, Isao
Author_Institution :
Univ. of Sci., Ho Chi Minh City, Vietnam
fYear :
2012
fDate :
17-18 Nov. 2012
Firstpage :
641
Lastpage :
645
Abstract :
Sharing information is one of the most important parts of social activities. However, sharing information can leak users' information, and removing all direct identifiers is not enough. Sweeney proposed an approach that applies k-anonymity to protect users' identities from linking attacks. Sweeney's algorithm finds the optimal anonymized dataset under a minimal distortion metric. Other authors have proposed other optimal algorithms, but these proposals remain impractical because of their high computational cost. Another approach is to release a minimal anonymized dataset by applying heuristics. Wang and Fung proposed Bottom-up Generalization and Top-down Specialization (TDS) to publish a minimal anonymized dataset under an information loss metric, which is more efficient. However, these algorithms still have some limitations. In this paper, we propose an algorithm that publishes anonymized datasets using a bottom-up generalization approach and an information loss data metric. Our algorithm saves time by storing statistical information for later use. The experiments are performed on the Adult dataset, which was used to evaluate all of the former algorithms. Experimental results show that our algorithm can process a dataset of 949,662 records in 42.219 s. The classification error on the anonymized data produced by our algorithm is 3.8% lower than that of Wang's algorithm.
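As an illustration of the terms used in the abstract, the following is a minimal sketch of a k-anonymity check and of a single bottom-up generalization step. It is not the authors' indexed algorithm and does not implement their information loss metric or their stored statistics; the function names, the toy taxonomy, and the toy records are all assumptions made for this example.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    occurs in at least k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(c >= k for c in counts.values())

def generalize(records, attribute, taxonomy):
    """Replace each value of `attribute` by its parent in `taxonomy`
    (a child -> parent mapping); values without a parent are kept."""
    return [{**r, attribute: taxonomy.get(r[attribute], r[attribute])}
            for r in records]

# Hypothetical toy taxonomy and records, not taken from the Adult dataset.
TAXONOMY = {
    "Bachelors": "University",
    "Masters": "University",
    "9th": "Secondary",
    "10th": "Secondary",
}
RECORDS = [
    {"education": "Bachelors", "sex": "M"},
    {"education": "Masters", "sex": "M"},
    {"education": "9th", "sex": "F"},
    {"education": "10th", "sex": "F"},
]
QI = ["education", "sex"]

# One bottom-up step: lift "education" one level up the taxonomy.
GENERALIZED = generalize(RECORDS, "education", TAXONOMY)
```

In this toy case the raw records are not 2-anonymous because every (education, sex) pair is unique, while the generalized records are, since each pair then appears twice. A heuristic bottom-up method would repeat such steps, choosing at each step which generalization to apply according to its metric.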
Keywords :
electronic publishing; peer-to-peer computing; security of data; statistical analysis; Sweeney's algorithm; TDS; Wang algorithm; adult dataset; anonymized data Classification error; anonymized data publishing; fung proposed bottom-up generalization; high computational cost; indexed bottom-up approach; information loss data metric; information sharing; k-anonymity; linking attack; minimal distortion metric; optimal anonymized dataset; social activities; statistical information storage; top-down specialization; users identities protection; Data models; Data privacy; Measurement; Partitioning algorithms; Publishing; Taxonomy; Training; Bottom-up; anonymized data; k-anonymity;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Security (CIS), 2012 Eighth International Conference on
Conference_Location :
Guangzhou
Print_ISBN :
978-1-4673-4725-9
Type :
conf
DOI :
10.1109/CIS.2012.148
Filename :
6405918