Title :
On the Lower Bound of Local Optimums in K-Means Algorithm
Author :
Zhang, Zhenjie ; Dai, Bing Tian ; Tung, Anthony K H
Author_Institution :
Sch. of Comput., Nat. Univ. of Singapore, Singapore
Abstract :
The k-means algorithm is a popular clustering method used in many fields of computer science, such as data mining, machine learning and information retrieval. However, the k-means algorithm is very likely to converge to a local optimum that is much worse than the desired global optimum. To overcome this problem, the k-means algorithm and its variants are usually run many times with different initial centers to avoid being trapped in local optimums of unacceptable quality. In this paper, we propose an efficient method to compute a lower bound on the cost of the local optimum reached from the current center set. After every k-means iteration, the algorithm can halt the procedure if the lower bound on the cost of the future local optimum is worse than the best solution computed so far. Although this lower bound computation adds some extra time to each iteration, extensive experiments on both synthetic and real data sets show that the method can prune many of the unnecessary iterations and improve the efficiency of the algorithm on most of the data sets, especially those with high dimensionality and large k.
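The control flow described in the abstract (restart k-means several times, and abandon a run as soon as a lower bound on its eventual cost exceeds the best cost found so far) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the lower bound formula, so `lower_bound_fn` is a hypothetical placeholder parameter, and the default `trivial_lower_bound` returns 0, which is always valid but never prunes. All function and parameter names here are assumptions made for the example.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def cost(points, centers):
    """k-means cost: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def lloyd_step(points, centers):
    """One k-means iteration: assign points, then recompute centers."""
    groups = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        groups[j].append(p)
    new_centers = []
    for g, c in zip(groups, centers):
        if g:
            new_centers.append(tuple(sum(x) / len(g) for x in zip(*g)))
        else:
            new_centers.append(c)  # keep the center of an empty cluster
    return new_centers

def trivial_lower_bound(points, centers):
    # Placeholder (assumption): 0 is a valid lower bound on any k-means
    # cost, but it never triggers pruning. The paper's bound is not
    # reproduced here.
    return 0.0

def kmeans_with_pruning(points, k, restarts=10, max_iter=100,
                        lower_bound_fn=trivial_lower_bound, seed=0):
    rng = random.Random(seed)
    best_cost, best_centers, pruned = float("inf"), None, 0
    for _ in range(restarts):
        centers = rng.sample(points, k)
        abandoned = False
        for _ in range(max_iter):
            new_centers = lloyd_step(points, centers)
            converged = new_centers == centers
            centers = new_centers
            if converged:
                break
            # Halt this run if even its best possible outcome is worse
            # than the best solution computed so far.
            if lower_bound_fn(points, centers) > best_cost:
                abandoned = True
                break
        if abandoned:
            pruned += 1
            continue
        run_cost = cost(points, centers)
        if run_cost < best_cost:
            best_cost, best_centers = run_cost, centers
    return best_centers, best_cost, pruned

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, best, pruned = kmeans_with_pruning(points, k=2)
```

With the trivial bound, `pruned` stays at 0 and the procedure reduces to ordinary multi-restart k-means; any savings depend entirely on plugging in a tighter, still valid bound for `lower_bound_fn`.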
Keywords :
iterative methods; pattern clustering; clustering method; computer science; data mining; information retrieval; k-means iteration; local optimums; machine learning; Acceleration; Clustering algorithms; Clustering methods; Computer science; Cost function; Data mining; Euclidean distance; Information retrieval; Machine learning; Machine learning algorithms
Conference_Title :
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2701-7
DOI :
10.1109/ICDM.2006.118