Title :
Time and space efficient spectral clustering via column sampling
Author :
Li, Mu ; Lian, Xiao-Chen ; Kwok, James T. ; Lu, Bao-Liang
Author_Institution :
Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai, China
Abstract :
Spectral clustering is an elegant and powerful approach for clustering. However, the underlying eigen-decomposition takes cubic time and quadratic space w.r.t. the data set size. These can be reduced by the Nyström method which samples only a subset of columns from the matrix. However, the manipulation and storage of these sampled columns can still be expensive when the data set is large. In this paper, we propose a time- and space-efficient spectral clustering algorithm which can scale to very large data sets. A general procedure to orthogonalize the approximated eigenvectors is also proposed. Extensive spectral clustering experiments on a number of data sets, ranging in size from a few thousands to several millions, demonstrate the accuracy and scalability of the proposed approach. We further apply it to the task of image segmentation. For images with more than 10 millions pixels, this algorithm can obtain the eigenvectors in 1 minute on a single machine.
Keywords :
approximation theory; eigenvalues and eigenfunctions; image sampling; image segmentation; pattern clustering; Nyström method; column sampling; eigenvectors; image segmentation; space efficient spectral clustering algorithm; time efficient spectral clustering algorithm; Algorithm design and analysis; Approximation algorithms; Approximation methods; Clustering algorithms; Complexity theory; Eigenvalues and eigenfunctions; Laplace equations;
Conference_Titel :
Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on
Conference_Location :
Providence, RI
Print_ISBN :
978-1-4577-0394-2
DOI :
10.1109/CVPR.2011.5995425