Abstract:
With the development of sensor networks, mobile computing, and web applications, data are now collected from many distributed sources to form big datasets. Such datasets can be hosted in the cloud for economical processing and sharing. However, these data may be highly sensitive, requiring secure storage and processing. We envision a cloud-based data storage and processing framework that enables users to share and handle big datasets economically and securely. Under this framework, we study matrix-based data mining algorithms, with a focus on a secure top-k eigenvector algorithm. Our approach uses an iterative processing model in which the authorized user interacts with the cloud to obtain the result. Throughout this process, both the source matrix and the intermediate results remain confidential, and the client side incurs low cost. The security of the approach rests on Paillier encryption combined with a random perturbation technique, and we carefully analyze it under a cloud-specific threat model. Our experimental results show that the proposed method scales to big matrices while keeping client-side costs low.
Keywords:
Web services; cloud computing; cryptography; data mining; eigenvalues and eigenfunctions; matrix algebra; Paillier encryption; Web applications; authorized user; big datasets; cloud based data storage; cloud specific threat model; iterative processing model; matrix based data mining algorithms; mobile computing; processing framework; secure computation; sensor network; shared matrices; source matrix; top-k eigenvector algorithm; Distributed databases; Encryption; Sparse matrices; Vectors; MapReduce; big matrix; performance; power iteration; security
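
Illustrative sketch:
The abstract names two ingredients, an iterative (power-iteration) model and Paillier encryption, without spelling out the protocol. The toy Python sketch below only illustrates how those two pieces can fit together and should not be read as the authors' algorithm. Its assumptions: the key is built from two small Mersenne primes (far too small for real use), the matrix has integer entries, the iterate is sent to the cloud in the clear as a fixed-point integer vector (the random perturbation step is omitted), and only the dominant eigenvector (k = 1) is computed. All function names and the `SCALE` constant are hypothetical.

```python
import math
import random

# Toy Paillier key from two Mersenne primes (assumption: demo only, insecure size).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p-1, q-1)
MU = pow(LAM, -1, N)  # valid because the generator is g = N + 1 (Python 3.8+)


def encrypt(m):
    """Encrypt an integer m; negative values are stored modulo N."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    # With g = N + 1, g^m = 1 + m*N (mod N^2).
    return (1 + (m % N) * N) % N2 * pow(r, N, N2) % N2


def decrypt(c):
    """Decrypt and map the result back to a signed integer."""
    m = (pow(c, LAM, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m


def cloud_matvec(enc_rows, x):
    """Cloud side: compute Enc(A x) from Enc(A) and an integer vector x.

    Uses Enc(a)^k = Enc(k*a) and Enc(a) * Enc(b) = Enc(a + b).
    """
    out = []
    for row in enc_rows:
        acc = 1  # decrypts to 0, so it acts as the empty sum
        for c, xj in zip(row, x):
            acc = acc * pow(c, xj % N, N2) % N2
        out.append(acc)
    return out


SCALE = 10**6  # hypothetical fixed-point scale for the real-valued iterate


def dominant_eigenvector(enc_rows, iters=50):
    """Client side: power iteration, delegating A x to the cloud."""
    x = [1.0] * len(enc_rows)
    for _ in range(iters):
        xi = [round(v * SCALE) for v in x]           # fixed-point encoding
        y = [decrypt(c) / SCALE for c in cloud_matvec(enc_rows, xi)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]                    # renormalize each round
    return x


# Data owner encrypts a small symmetric matrix entry by entry.
A = [[2, 1], [1, 3]]
enc_A = [[encrypt(a) for a in row] for row in A]
v = dominant_eigenvector(enc_A)
```

In this sketch the cloud never sees the entries of A, and the client's per-iteration work is limited to decrypting one vector and renormalizing it. Hiding the iterate itself would need an additional mechanism, such as the random perturbation the abstract mentions.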