Title :
Speedup of fuzzy and possibilistic kernel c-means for large-scale clustering
Author :
Havens, Timothy C. ; Chitta, Radha ; Jain, Anil K. ; Jin, Rong
Author_Institution :
Dept. of Comput. Sci. & Eng., Michigan State Univ., East Lansing, MI, USA
Abstract :
The ubiquity of personal computing technology has produced an abundance of staggeringly large data sets: the Library of Congress has stored over 160 terabytes of web data, and it is estimated that Facebook alone logs over 25 terabytes of data per day. There is a great need for systems by which one can elucidate the similarity and dissimilarity among and between groups in these data sets. Clustering is one way to find these groups. In this paper, we propose an approximation method for the fuzzy and possibilistic kernel c-means clustering algorithms. Our approximation constrains the cluster centers to be linear combinations of a randomly selected subset of size m of the n input objects, where m ≪ n. The proposed algorithm requires only an m × n rectangular portion of the full n × n kernel matrix and the n diagonal values, resulting in significant memory savings. Furthermore, the computational complexity of the c-means algorithm is substantially reduced. We demonstrate that up to three orders of magnitude of speedup are possible while achieving almost the same performance as the original kernel c-means algorithm.
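The memory argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF kernel, the parameter `gamma`, and the function names `rbf_kernel`, `sampled_kernel_block`, and `kernel_distances` are all assumptions made for illustration. The sketch shows only that, once each cluster center is written as a linear combination of m sampled objects with coefficients `alpha`, the squared kernel-space distance from every object to every center can be computed from the m × n kernel block and the n diagonal values alone.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Assumed kernel choice (the abstract does not name one).
    # Squared Euclidean distances between rows of A and rows of B.
    sq = (A**2).sum(1)[:, None] - 2.0 * A @ B.T + (B**2).sum(1)[None, :]
    return np.exp(-gamma * np.maximum(sq, 0.0))

def sampled_kernel_block(X, m, gamma=1.0, seed=0):
    # Randomly pick m of the n objects and compute only the
    # m x n block of the kernel matrix plus its n diagonal values.
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)
    K_mn = rbf_kernel(X[idx], X, gamma)   # shape (m, n)
    diag = np.ones(X.shape[0])            # k(x, x) = 1 for the RBF kernel
    return idx, K_mn, diag

def kernel_distances(alpha, idx, K_mn, diag):
    # alpha has shape (c, m): row j holds the coefficients that express
    # center j as a combination of the m sampled objects.
    K_mm = K_mn[:, idx]                                  # (m, m) sub-block
    cross = alpha @ K_mn                                 # (c, n)
    center_norm = np.einsum('ij,jk,ik->i', alpha, K_mm, alpha)  # (c,)
    # ||phi(x_k) - v_j||^2 = K_kk - 2 * alpha_j . K_mn[:, k] + alpha_j' K_mm alpha_j
    return diag[None, :] - 2.0 * cross + center_norm[:, None]
```

With n objects and m samples, the stored block holds m·n entries instead of n², which is where the memory saving described above comes from. How the coefficients `alpha` are updated in each iteration is specific to the paper and is not reproduced here.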
Keywords :
approximation theory; computational complexity; fuzzy set theory; pattern clustering; ubiquitous computing; Facebook; Web data; approximation method; fuzzy kernel c-means clustering algorithms; large scale clustering; Library of Congress; personal computing technology ubiquity; possibilistic kernel c-means clustering algorithms; Approximation algorithms; Approximation methods; Clustering algorithms; Kernel; Memory management; Partitioning algorithms; Phase change materials; clustering; fuzzy partitions; kernel methods; large-scale data; possibilistic partitions
Conference_Title :
Fuzzy Systems (FUZZ), 2011 IEEE International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-7315-1
Electronic_ISBN :
1098-7584
DOI :
10.1109/FUZZY.2011.6007618