Title :
An Improved Initialization Method for Clustering High-Dimensional Data
Author :
Zhang, Yanping ; Jiang, Qingshan
Author_Institution :
Software Sch., Xiamen Univ., Xiamen, China
Abstract :
Searching for initial centers in high-dimensional space is an interesting and important problem that is relevant to the many variants of the K-Means algorithm. However, it is a very difficult problem because of the "curse of dimensionality" and the inherent sparsity of the data. The IMSND algorithm is one of the latest initialization methods based on the idea of sharing neighborhood density. To address the accuracy and the input parameters of IMSND, an optimized algorithm is presented that employs a new density measure with a distance weight coefficient to improve search accuracy. Experimental results on real-world datasets show that our algorithm outperforms other algorithms, including IMSND.
Keywords :
data mining; pattern clustering; search problems; data clustering; distance weight coefficient; initialization method; k-means algorithm; optimized algorithm; Accuracy; Algorithm design and analysis; Clustering algorithms; Density measurement; Noise; Software algorithms; Weight measurement
Conference_Title :
Database Technology and Applications (DBTA), 2010 2nd International Workshop on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-6975-8
Electronic_ISBN :
978-1-4244-6977-2
DOI :
10.1109/DBTA.2010.5659001