DocumentCode :
2350644
Title :
Optimizing PPDM in asynchronous sparse data using random projection
Author :
Kumar, Raja R. ; Indumathi, J. ; Uma, G.V.
Author_Institution :
Department of Computer Science and Engineering, College of Engineering, Anna University, Chennai - 600025, Tamilnadu, India
fYear :
2008
fDate :
13-15 July 2008
Firstpage :
410
Lastpage :
415
Abstract :
Privacy is becoming an increasingly important issue in data-mining applications that deal with sensitive data, especially in health care, security, finance, and behavioral analysis. Most existing techniques adopt a Secure Two-Party Computation model, in which two parties, each holding a private database, want to cooperatively run data-mining operations on the union of their data. The problem we address in Privacy Preserving Data Mining (PPDM) is how a data owner can release a version of its confidential data with guarantees that the original sensitive information cannot be re-identified, while the analytic properties of the data are preserved. In this paper we investigate the use of multiplicative random projection with sparse matrices to preserve privacy in datasets that are incremented asynchronously over time from various sources. We propose using random projections with a sparse matrix to maintain a sketch of a collection of high-dimensional data streams that are updated asynchronously. This sketch allows us to estimate L2 (Euclidean) distances and dot products with high accuracy. We also propose a conceptual architecture for implementing privacy preservation techniques, in particular the Sparse Random Projection Matrix technique, on incremental data to improve the level of privacy protection. We have tested that the perturbed data preserves certain statistical characteristics of the original unperturbed data. Finally, we propose a generic projection-based sketch for incremental data streams that can be used not only for this application but also for any other application that supports incremental databases.
We have traced the origin of PPDM, the definition of privacy preservation in data mining, and the implications of established privacy principles for knowledge discovery, and we advocate a few policies for PPDM based on these privacy principles. These are vital for the development and deployment of methodological solutions, and will let vendors and developers build robust information reuse and integration (IRI) in PPDM. We seek to maximize the reuse of PPDM information by creating simple, rich, and reusable knowledge representations, and accordingly investigate strategies for integrating this knowledge into legacy systems and advancing the future of PPDM.
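The sketching idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `SparseProjectionSketch`, the helper functions, the per-coordinate seeding scheme, and the choice of a sparse matrix with entries +s, 0, -s drawn with probabilities 1/6, 2/3, 1/6 are all assumptions made for illustration, since the record does not specify the matrix distribution. Because the projection is linear, an update to one coordinate of a stream changes the sketch by that amount times one column of the matrix, so updates can arrive in any order. Two sketches are comparable only if they share the same seed.

```python
import math
import random


class SparseProjectionSketch:
    """Hypothetical k-dimensional linear sketch of a d-dimensional stream."""

    def __init__(self, dim, k, seed=0):
        self.dim = dim
        self.k = k
        self.seed = seed
        self.values = [0.0] * k  # projected vector, initially the zero stream

    def _column(self, i):
        # Regenerate column i of the projection matrix on demand from a
        # per-coordinate seed, so the k x dim matrix is never stored.
        rng = random.Random(self.seed * 1_000_003 + i)
        scale = math.sqrt(3.0 / self.k)  # gives each entry variance 1/k
        col = []
        for _ in range(self.k):
            u = rng.random()
            # Assumed sparse distribution: +scale w.p. 1/6, -scale w.p. 1/6,
            # and 0 w.p. 2/3.
            col.append(scale if u < 1 / 6 else -scale if u < 2 / 6 else 0.0)
        return col

    def update(self, i, delta):
        # Stream update "x[i] += delta": by linearity the sketch changes by
        # delta * column_i, independent of the order in which updates arrive.
        for j, r in enumerate(self._column(i)):
            if r:
                self.values[j] += delta * r


def estimate_distance(a, b):
    """Estimate the L2 distance between the two underlying streams."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a.values, b.values)))


def estimate_dot(a, b):
    """Estimate the dot product of the two underlying streams."""
    return sum(p * q for p, q in zip(a.values, b.values))
```

In this sketch only the k projected values per stream are kept or released, and the accuracy of both estimates improves as k grows. Whether a given k and matrix distribution provide adequate privacy protection is a separate question that this example does not address.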
Keywords :
Computational modeling; Data privacy; Data security; Databases; Financial management; Information analysis; Medical services; Protection; Sparse matrices; Testing; Asynchronous Data Stream; Data Sets; L2 (Euclidean) distances; PPDM_IRI; Performance; Privacy; Privacy Preserving Data Mining (PPDM); Random Projection; Secure Data Mining; Sparse Matrix;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Reuse and Integration, 2008. IRI 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV, USA
Print_ISBN :
978-1-4244-2659-1
Electronic_ISBN :
978-1-4244-2660-7
Type :
conf
DOI :
10.1109/IRI.2008.4583066
Filename :
4583066