Title :
A Simple and Fast Algorithm for Global K-means Clustering
Author :
Xie, Juanying ; Jiang, Shuai
Author_Institution :
Sch. of Comput. Sci., Shaanxi Normal Univ., Xi´´an, China
Abstract :
K-means clustering is a popular clustering algorithm based on the partition of data. However, there are some shortcomings of it, such as its requiring a user to give out the number of clusters at first, and its sensitiveness to initial conditions, and its easily getting to the trap of a local solution et cetera. The global K-means algorithm proposed by Likas et al is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure consisting of N (with N being the size of the data set) runs of the K-means algorithm from suitable initial positions. It avoids the depending on any initial conditions or parameters, and considerably outperforms the K-means algorithms, but it has a heavy computational load. In this paper, we propose a new version of the global K-means algorithm. The outstanding feature of our algorithm is its superiority in execution time. It takes less run time than that of the available global K-means algorithms. This great advantage is due to that we improved the way of creating the next cluster center in the global K-means algorithm. We defined a novel function to select the optimal candidate center for the next cluster enlightened by the idea of K-medoids clustering algorithm suggested by Park and Jun. Experiments on some well-known data sets from UCI and on a synthetic data set show that our new algorithm can significantly reduce the computational time without affecting the performance of the global K-means algorithm. The further experiments demonstrate that our improved algorithm outperforms the global K-means algorithm greatly.
Keywords :
pattern clustering; fast algorithm; global K-means clustering; global search procedure; simple algorithm; Clustering algorithms; Computer science; Computer science education; Data mining; Educational technology; Machine learning; Machine learning algorithms; Optimization methods; Partitioning algorithms; Pattern recognition; K-means clustering; clustering; data mining; global K-means clustering; machine learning; non-smooth optimization; pattern recognition;
Conference_Titel :
Education Technology and Computer Science (ETCS), 2010 Second International Workshop on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-6388-6
Electronic_ISBN :
978-1-4244-6389-3
DOI :
10.1109/ETCS.2010.347