DocumentCode
3720272
Title
A novel approach of data sanitization by noise addition and knowledge discovery by clustering
Author
Hadi Abdullah;Ahsan Siddiqi;Fuad Bajaber
Author_Institution
University of Richmond, Richmond, USA
fYear
2015
Firstpage
1
Lastpage
9
Abstract
The security of published data is no less important than that of unpublished data. Therefore, organizations that record large volumes of data remove personally identifiable information (PII) and sanitize the data before publishing it. However, ensuring privacy and security in this way can reduce the utility of the published data for knowledge discovery, so a balance is required between the privacy and the utility of published data. In this paper, we study this balance by evaluating four data mining clustering techniques for knowledge discovery, and we propose two privacy/utility quantification parameters. We then perform a number of experiments to identify statistically which clustering technique is best suited to a desired level of privacy/utility as noise is incrementally increased by simultaneously degrading data accuracy, completeness, and consistency.
Keywords
"Data privacy","Privacy","Databases","Knowledge discovery","Data security"
Publisher
IEEE
Conference_Titel
Computer Networks and Information Security (WSCNIS), 2015 World Symposium on
Print_ISBN
978-1-4799-9906-4
Type
conf
DOI
10.1109/WSCNIS.2015.7368283
Filename
7368283