مرکز منطقه ای اطلاع رساني علوم و فناوري - A Simple and Fast Algorithm for Global K-means Clustering

DocumentCode :

2303377

Title :

A Simple and Fast Algorithm for Global K-means Clustering

Author :

Xie, Juanying ; Jiang, Shuai

Author_Institution :

Sch. of Comput. Sci., Shaanxi Normal Univ., Xi´´an, China

Volume :

fYear :

2010

fDate :

6-7 March 2010

Firstpage :

Lastpage :

Abstract :

K-means clustering is a popular clustering algorithm based on the partition of data. However, there are some shortcomings of it, such as its requiring a user to give out the number of clusters at first, and its sensitiveness to initial conditions, and its easily getting to the trap of a local solution et cetera. The global K-means algorithm proposed by Likas et al is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure consisting of N (with N being the size of the data set) runs of the K-means algorithm from suitable initial positions. It avoids the depending on any initial conditions or parameters, and considerably outperforms the K-means algorithms, but it has a heavy computational load. In this paper, we propose a new version of the global K-means algorithm. The outstanding feature of our algorithm is its superiority in execution time. It takes less run time than that of the available global K-means algorithms. This great advantage is due to that we improved the way of creating the next cluster center in the global K-means algorithm. We defined a novel function to select the optimal candidate center for the next cluster enlightened by the idea of K-medoids clustering algorithm suggested by Park and Jun. Experiments on some well-known data sets from UCI and on a synthetic data set show that our new algorithm can significantly reduce the computational time without affecting the performance of the global K-means algorithm. The further experiments demonstrate that our improved algorithm outperforms the global K-means algorithm greatly.

Keywords :

pattern clustering; fast algorithm; global K-means clustering; global search procedure; simple algorithm; Clustering algorithms; Computer science; Computer science education; Data mining; Educational technology; Machine learning; Machine learning algorithms; Optimization methods; Partitioning algorithms; Pattern recognition; K-means clustering; clustering; data mining; global K-means clustering; machine learning; non-smooth optimization; pattern recognition;

fLanguage :

English

Publisher :

ieee

Conference_Titel :

Education Technology and Computer Science (ETCS), 2010 Second International Workshop on

Conference_Location :

Wuhan

Print_ISBN :

978-1-4244-6388-6

Electronic_ISBN :

978-1-4244-6389-3

Type :

conf

DOI :

10.1109/ETCS.2010.347

Filename :

5460041

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2303377