DocumentCode :
3717343
Title :
Multi-probe random projection clustering to secure very large distributed datasets
Author :
Lee A. Carraher;Philip A. Wilsey;Anindya Moitra;Sayantan Dey
Author_Institution :
University of Cincinnati, Cincinnati, OH 45221-0030
fYear :
2015
Firstpage :
1891
Lastpage :
1900
Abstract :
This paper presents a solution to the approximate k-means clustering problem for very large distributed datasets. Distributed data models have gained popularity in recent years following the efforts of commercial, academic, and government organizations to make data more widely accessible. Due to the sheer volume of available data, in-memory single-core computation quickly becomes infeasible, requiring distributed multiprocessing. Our solution achieves clustering performance comparable to that of other popular clustering algorithms, with improved overall complexity growth, while being amenable to distributed processing frameworks such as Map-Reduce. Our solution also maintains certain guarantees regarding data privacy and deanonymization.
Keywords :
"Clustering algorithms","Lattices","Approximation algorithms","Distributed databases","Algorithm design and analysis","Partitioning algorithms","Complexity theory"
Publisher :
IEEE
Conference_Titel :
Big Data (Big Data), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/BigData.2015.7363964
Filename :
7363964