Title :
An Efficient K-means Clustering Algorithm Based on Influence Factors
Author :
Leng, Mingwei ; Tang, Haitao ; Chen, Xiaoyun
Author_Institution :
Shangrao Normal Coll., Shangrao
fDate :
July 30 2007-Aug. 1 2007
Abstract :
Clustering is one of the most widely studied topics in data mining and pattern recognition, and k-means is one of the most popular, simple, and fast clustering algorithms. However, the right value of k is unknown, and selecting effective initial points is also difficult. In view of this, much work has been done on variants of k-means that refine the initial points and detect the number of clusters. In this paper, we present a new algorithm, called efficient k-means clustering based on influence factors, which is divided into two stages and can automatically determine the actual value of k and select suitable initial points based on the characteristics of the dataset. We propose an influence factor to measure the similarity of two clusters and use it to determine whether the two clusters should be merged into one. To obtain a faster algorithm, a theorem is proposed and proved, and it is used to accelerate the algorithm. Experimental results on Gaussian datasets generated as in Pelleg and Moore (2000) show that the algorithm achieves high quality and obtains satisfactory results.
Keywords :
Gaussian processes; data mining; pattern clustering; Gaussian dataset; efficient k-means clustering algorithm; influence factor; pattern recognition; similarity measure; Acceleration; Artificial intelligence; Clustering algorithms; Distributed computing; Merging; Partitioning algorithms; Sampling methods; Software algorithms; Software engineering; clustering; initial points; k-means
Conference_Title :
Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, 2007. SNPD 2007. Eighth ACIS International Conference on
Conference_Location :
Qingdao
Print_ISBN :
978-0-7695-2909-7
DOI :
10.1109/SNPD.2007.279