DocumentCode
2915300
Title
Extending external validity measures for determining the number of clusters
Author
Zhao, Qinpei ; Xu, Mantao ; Fränti, Pasi
Author_Institution
Sch. of Comput., Univ. of Eastern Finland, Joensuu, Finland
fYear
2011
fDate
22-24 Nov. 2011
Firstpage
931
Lastpage
936
Abstract
External validity measures in cluster analysis evaluate how well clustering results match prior knowledge about the data. However, obtaining such prior knowledge is generally intractable in practical unsupervised learning problems such as cluster analysis. In this paper, we extend external validity measures for both hard and soft partitions by means of a resampling method that requires no prior information. To reduce the time burden of resampling, we incorporate two approaches into the proposed method: (i) an extension of external validity measures to soft partitions that runs in O(M^2 N) time; (ii) an efficient subsampling method with O(N) time complexity. The proposed method is then applied to and evaluated on the problem of determining the number of clusters in cluster analysis. Experimental results demonstrate that the proposed method is very effective in determining the number of clusters.
Keywords
computational complexity; pattern clustering; sampling methods; statistical analysis; unsupervised learning; cluster analysis; computational time; external validity measure; resampling method; soft partition; subsampling method; time complexity; unsupervised learning; Clustering algorithms; Correlation; Educational institutions; Image segmentation; Indexes; Partitioning algorithms; Time measurement; clustering; external cluster validity; image segmentation; subsampling;
fLanguage
English
Publisher
ieee
Conference_Titel
Intelligent Systems Design and Applications (ISDA), 2011 11th International Conference on
Conference_Location
Cordoba
ISSN
2164-7143
Print_ISBN
978-1-4577-1676-8
Type
conf
DOI
10.1109/ISDA.2011.6121777
Filename
6121777