DocumentCode :
3455538
Title :
Finding the Optimal Number of Clusters from Artificial Datasets
Author :
Päivinen, Niina ; Grönfors, Tapio
Author_Institution :
Dept. of Comput. Sci., Kuopio Univ., Kuopio
fYear :
2006
fDate :
20-22 Aug. 2006
Firstpage :
1
Lastpage :
6
Abstract :
This study addresses the problem of selecting the right number of clusters. Scale-free minimum spanning trees (SFMSTs) were constructed from artificial test datasets, and both the number of clusters, based on the distribution of the edge lengths, and the clustering itself were obtained from the tree structure. As a reference, the nearest neighbor and k-means clustering methods were used, with the number of clusters determined by the largest average silhouette width criterion. The SFMST clustering method proved able to find the optimal number of clusters in a dataset automatically, without any user-defined parameters.
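
The reference procedure mentioned in the abstract (cluster the data for several candidate values of k, then keep the k with the largest average silhouette width) can be illustrated with a minimal sketch. This is not the authors' SFMST method, and it is not taken from the paper. It assumes that "nearest neighbor clustering" means single-linkage clustering, obtained here by cutting the k-1 longest edges of an ordinary Euclidean minimum spanning tree. All function names and the sample points are hypothetical.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the n-1 tree edges as (length, i, j) tuples.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def single_linkage_labels(points, k):
    """Keep the n-k shortest MST edges and label the k connected components."""
    keep = sorted(mst_edges(points))[: len(points) - k]
    root = list(range(len(points)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for _, i, j in keep:
        root[find(i)] = find(j)
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(points))]


def average_silhouette(points, labels):
    """Mean silhouette width; singleton clusters contribute 0. Needs >= 2 clusters."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, points[j]) for j in other) / len(other)
            for lab, other in clusters.items()
            if lab != labels[i]
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_k(points, k_max):
    """Return the k in 2..k_max with the largest average silhouette width."""
    scores = {
        k: average_silhouette(points, single_linkage_labels(points, k))
        for k in range(2, k_max + 1)
    }
    return max(scores, key=scores.get), scores


# Hypothetical artificial dataset: three well-separated groups of four points.
sample_points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]
chosen_k, silhouette_by_k = best_k(sample_points, k_max=5)
```

Unlike the parameter-free behaviour the abstract reports for SFMST clustering, this sketch still requires the user to supply an upper bound (`k_max`) on the number of clusters to try.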
Keywords :
pattern clustering; statistical distributions; trees (mathematics); artificial dataset; edge length distribution; k-means clustering method; largest average silhouette width criterium; nearest neighbor clustering method; optimal cluster selection problem; probability distribution; scale-free minimum spanning tree; Bridges; Clustering methods; Computer science; Data analysis; Histograms; Joining processes; Nearest neighbor searches; Probability distribution; Testing; Tree graphs;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Cybernetics, 2006. ICCC 2006. IEEE International Conference on
Conference_Location :
Budapest
Print_ISBN :
1-4244-0071-6
Electronic_ISBN :
1-4244-0072-4
Type :
conf
DOI :
10.1109/ICCCYB.2006.305691
Filename :
4097652