Title :
An Improved Method for Privacy Preserving Data Mining
Author :
Poovammal, E. ; Ponnavaikko, M.
Author_Institution :
Dept. of CSE, SRM Univ., Chennai
Abstract :
In light of developments in technology for analyzing personal data, public concerns regarding privacy are rising. Often a data holder, such as a hospital or bank, needs to share person-specific records in such a way that the identities of the individuals who are the subjects of the data cannot be determined. Generalization techniques such as K-anonymity, L-diversity and t-closeness were proposed as solutions to the problem of privacy breach, at the cost of information loss. A few papers have also dealt with personalized generalization. However, all these methods were developed to solve the external linkage problem that results in sensitive attribute disclosure. It is very easy to prevent sensitive attribute disclosure by simply not publishing quasi-identifiers and sensitive attributes together; the only reason to publish generalized quasi-identifiers and sensitive attributes together is to support data mining tasks that consider both types of attributes in the database. Our goal in this paper is to eliminate the privacy breach (how much an adversary learns from the published data) and increase the utility (accuracy of the data mining task) of a released database. This is achieved by transforming part of the quasi-identifier and personalizing the sensitive attribute values. Our experiments, conducted on datasets from the UCI Machine Learning Repository, demonstrate an incremental gain in data mining utility while privacy is preserved to a great extent.
Keywords :
data mining; data privacy; generalisation (artificial intelligence); K-anonymous; L-diverse; UCI machine repository; external linkage problem; generalization techniques; generalized quasi identifiers; incremental gain; personal data; personalized generalization; privacy breach; privacy preserving data mining; sensitive attribute disclosure; t-closeness; Cancer; Data analysis; Data mining; Data privacy; Databases; Diseases; Hospitals; Joining processes; Publishing; Stomach; Quasi-identifiers; anonymity; fuzzy method; personalized privacy; privacy preservation
Conference_Title :
Advance Computing Conference, 2009. IACC 2009. IEEE International
Conference_Location :
Patiala
Print_ISBN :
978-1-4244-2927-1
Electronic_ISBN :
978-1-4244-2928-8
DOI :
10.1109/IADCC.2009.4809231