Title :
Randomized algorithm with constant approximation for k-means based on the least cluster size
Author :
Wang, Shouqiang ; Zhu, Daming ; Zhang, Sheng
Author_Institution :
Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan
Abstract :
The k-means clustering is one of the most popular schemes to solve the problem of clustering. This paper investigates the approximate algorithm for the k-means clustering by means of selecting the k initial points used as centers from the original point set. It is proved that an expected 2-approximation factor can be obtained, if k centers belong to one of the optimal sub cluster points respectively. To find these k points, a randomized algorithm is proposed which obtain an expected 2-approximation factor with high probability. This algorithm selects some points from the original points to be used as candidate centers, and the size of the sample is based on having at least points of each cluster. At last, some-examples are selected to verify our algorithm and get good results.
Keywords :
approximation theory; pattern clustering; approximate algorithm; constant approximation; expected 2-approximation factor; k-means clustering; randomized algorithm; Approximation algorithms; Automation; Clustering algorithms; Computer science; Intelligent control; algorithm; clustering; k-means; randomized algorithm;
Conference_Titel :
Intelligent Control and Automation, 2008. WCICA 2008. 7th World Congress on
Conference_Location :
Chongqing
Print_ISBN :
978-1-4244-2113-8
Electronic_ISBN :
978-1-4244-2114-5
DOI :
10.1109/WCICA.2008.4592800