DocumentCode :
779491
Title :
An efficient k-means clustering algorithm: analysis and implementation
Author :
Kanungo, Tapas ; Mount, David M. ; Netanyahu, Nathan S. ; Piatko, Christine D. ; Silverman, Ruth ; Wu, Angela Y.
Author_Institution :
Almaden Res. Center, San Jose, CA, USA
Volume :
24
Issue :
7
fYear :
2002
fDate :
7/1/2002 12:00:00 AM
Firstpage :
881
Lastpage :
892
Abstract :
In k-means clustering, we are given a set of n data points in d-dimensional space Rd and an integer k and the problem is to determine a set of k points in Rd, called centers, so as to minimize the mean squared distance from each data point to its nearest center. A popular heuristic for k-means clustering is Lloyd´s (1982) algorithm. We present a simple and efficient implementation of Lloyd´s k-means clustering algorithm, which we call the filtering algorithm. This algorithm is easy to implement, requiring a kd-tree as the only major data structure. We establish the practical efficiency of the filtering algorithm in two ways. First, we present a data-sensitive analysis of the algorithm´s running time, which shows that the algorithm runs faster as the separation between clusters increases. Second, we present a number of empirical studies both on synthetically generated data and on real data sets from applications in color quantization, data compression, and image segmentation
Keywords :
covariance matrices; filtering theory; pattern clustering; Lloyd algorithm; color quantization; data compression; data structure; data-sensitive analysis; filtering algorithm; image segmentation; k-means clustering algorithm; kd-tree; mean squared distance; Algorithm design and analysis; Clustering algorithms; Data analysis; Data compression; Data mining; Data structures; Filtering algorithms; Iterative algorithms; Machine learning algorithms; Pattern recognition;
fLanguage :
English
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher :
ieee
ISSN :
0162-8828
Type :
jour
DOI :
10.1109/TPAMI.2002.1017616
Filename :
1017616
Link To Document :
بازگشت