Title :
Finding the Optimal Number of Clusters from Artificial Datasets
Author :
Päivinen, Niina ; Grönfors, Tapio
Author_Institution :
Dept. of Comput. Sci., Kuopio Univ., Kuopio
Abstract :
This study addresses the problem of selecting the right number of clusters. Scale-free minimum spanning trees (SFMSTs) were constructed from artificial test datasets, and both the number of clusters, based on the distribution of the edge lengths, and the clustering itself were obtained from the structure of the tree. As a reference, the nearest neighbor and k-means clustering methods were used, with the number of clusters determined by the largest average silhouette width criterion. The SFMST clustering method proved able to find the optimal number of clusters in a dataset automatically, without any user-defined parameters.
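The reference procedure mentioned in the abstract (choosing the number of clusters by the largest average silhouette width) can be illustrated with a minimal sketch. This is not the SFMST method and not the authors' code. It assumes that "nearest neighbor" clustering means single-linkage agglomerative clustering, that distances are Euclidean, and that a singleton cluster contributes a silhouette of 0. The function names, the toy `points`, and the `k_max` bound are all hypothetical.

```python
from itertools import combinations
from math import dist


def single_linkage(points, k):
    """Merge the two closest clusters until k remain (assumed 'nearest neighbor' linkage)."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        # Pick the pair of clusters with the smallest point-to-point distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: min(
                dist(points[i], points[j])
                for i in clusters[ab[0]]
                for j in clusters[ab[1]]
            ),
        )
        clusters[a].extend(clusters[b])
        del clusters[b]  # a < b, so index a is unaffected
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def average_silhouette_width(points, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all points."""
    n = len(points)
    total = 0.0
    for i in range(n):
        # Distances from point i to every other point, grouped by cluster label.
        by_cluster = {}
        for j in range(n):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist(points[i], points[j]))
        own = by_cluster.pop(labels[i], None)
        if own is None or not by_cluster:
            continue  # singleton cluster or a single cluster overall: count s(i) as 0
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other cluster
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


def best_k(points, k_max):
    """Return the k in 2..k_max with the largest average silhouette width, plus all scores."""
    scores = {
        k: average_silhouette_width(points, single_linkage(points, k))
        for k in range(2, k_max + 1)
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy data: three well-separated unit squares.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 10), (5, 11), (6, 10), (6, 11),
]
k, scores = best_k(points, k_max=5)
print(k, scores)
```

Note that this criterion requires the user to supply a range of candidate values for k, which is the kind of user-defined parameter the abstract says the SFMST method avoids.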
Keywords :
pattern clustering; statistical distributions; trees (mathematics); artificial dataset; edge length distribution; k-means clustering method; largest average silhouette width criterion; nearest neighbor clustering method; optimal cluster selection problem; probability distribution; scale-free minimum spanning tree; Bridges; Clustering methods; Computer science; Data analysis; Histograms; Joining processes; Nearest neighbor searches; Probability distribution; Testing; Tree graphs
Conference_Title :
Computational Cybernetics, 2006. ICCC 2006. IEEE International Conference on
Conference_Location :
Budapest
Print_ISBN :
1-4244-0071-6
Electronic_ISBN :
1-4244-0072-4
DOI :
10.1109/ICCCYB.2006.305691