Title of article :
Performance analysis of k-means with different initialization methods for high dimensional data
Author/Authors :
Tajunisha and Saravanan, Author
Issue Information :
Periodical with serial number, year 2010
Abstract :
Developing an effective clustering method for high-dimensional datasets is a challenging problem because of the curse of dimensionality. Among partition-based clustering algorithms, k-means is one of the best-known methods for partitioning a dataset into groups of patterns. However, k-means converges to one of many local minima, and the final result is known to depend on the initial starting points (means). Many methods have been proposed to improve the performance of the k-means algorithm. In this paper, we compare the performance of our proposed method with that of existing works. The proposed method uses Principal Component Analysis (PCA) both for dimension reduction and to find the initial centroids for k-means. It then uses a heuristic approach to reduce the number of distance calculations needed to assign each data point to a cluster. A comparison of results on the iris data set showed that the proposed method is more effective than the existing method.
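The pipeline outlined in the abstract (PCA for dimension reduction, PCA-derived initial centroids, then k-means) might be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how the centroids are obtained from PCA, so the rule used here (sort points by their first principal component score, split them into k equal groups, and average each group) is an assumption. The function names and the synthetic data are hypothetical, and the distance-reduction heuristic is omitted because the abstract gives no details of it.

```python
import numpy as np

def pca_project(X, n_components):
    """Project X onto its top principal components via SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def initial_centroids_from_pc1(Z, k):
    """ASSUMED rule (not from the paper): sort points by first-PC score,
    split them into k nearly equal groups, and use each group's mean."""
    order = np.argsort(Z[:, 0])
    groups = np.array_split(order, k)
    return np.array([Z[idx].mean(axis=0) for idx in groups])

def kmeans(Z, centroids, max_iter=100):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = centroids.copy()
    for _ in range(max_iter):
        # Squared Euclidean distance from every point to every centroid (n x k).
        d2 = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Synthetic stand-in for a small 4-feature data set (hypothetical data, not iris).
rng = np.random.default_rng(0)
centers = np.array([[0.0] * 4, [5.0] * 4, [10.0] * 4])
X = np.vstack([c + 0.5 * rng.standard_normal((50, 4)) for c in centers])

Z = pca_project(X, n_components=2)        # reduce 4 dimensions to 2
init = initial_centroids_from_pc1(Z, k=3) # deterministic starting means
labels, final_centroids = kmeans(Z, init)
```

Because the starting centroids are computed from the data rather than drawn at random, repeated runs give the same clustering, which addresses the dependence on initial starting points that the abstract mentions.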
Keywords :
Principal component analysis , k-means , initial centroid , Dimension reduction
Journal title :
International Journal of Artificial Intelligence & Applications