DocumentCode :
2985472
Title :
Efficient Kernel Clustering Using Random Fourier Features
Author :
Chitta, R. ; Rong Jin ; Jain, Anubhav K.
Author_Institution :
Dept. of Comput. Sci. & Eng., Michigan State Univ., East Lansing, MI, USA
fYear :
2012
fDate :
10-13 Dec. 2012
Firstpage :
161
Lastpage :
170
Abstract :
Kernel clustering algorithms have the ability to capture the non-linear structure inherent in many real world data sets and thereby, achieve better clustering performance than Euclidean distance based clustering algorithms. However, their quadratic computational complexity renders them non-scalable to large data sets. In this paper, we employ random Fourier maps, originally proposed for large scale classification, to accelerate kernel clustering. The key idea behind the use of random Fourier maps for clustering is to project the data into a low-dimensional space where the inner product of the transformed data points approximates the kernel similarity between them. An efficient linear clustering algorithm can then be applied to the points in the transformed space. We also propose an improved scheme which uses the top singular vectors of the transformed data matrix to perform clustering, and yields a better approximation of kernel clustering under appropriate conditions. Our empirical studies demonstrate that the proposed schemes can be efficiently applied to large data sets containing millions of data points, while achieving accuracy similar to that achieved by state-of-the-art kernel clustering algorithms.
Keywords :
Fourier transforms; computational complexity; pattern classification; pattern clustering; Euclidean distance based clustering algorithms; kernel clustering algorithms; large scale classification; nonlinear structure; quadratic computational complexity; random Fourier features; random Fourier maps; real world data sets; transformed data matrix; Accuracy; Approximation algorithms; Approximation methods; Clustering algorithms; Complexity theory; Kernel; Vectors; Kernel clustering; Kernel k-means; Random Fourier features; Scalability;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Mining (ICDM), 2012 IEEE 12th International Conference on
Conference_Location :
Brussels
ISSN :
1550-4786
Print_ISBN :
978-1-4673-4649-8
Type :
conf
DOI :
10.1109/ICDM.2012.61
Filename :
6413906
Link To Document :
بازگشت