Title :
Enhanced cluster validity index for the evaluation of optimal number of clusters for Fuzzy C-Means algorithm
Author :
Bharill, Neha ; Tiwari, Anish
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Indore, Indore, India
Abstract :
A cluster validity index is a measure used to determine the optimal number of clusters, denoted by C, and an optimal fuzzy partition for clustering algorithms. In this paper, we propose a new cluster validity index, called the VIDSO index, to determine the optimal number of hyper-ellipsoid or hyper-spherical clusters generated by the Fuzzy C-Means (FCM) algorithm. The proposed validity index jointly exploits three measures: intra-cluster compactness, inter-cluster separation, and overlap between the clusters. The proposed intra-cluster compactness measure is based on the concept of relative variability, a statistical measure of the relative dispersion or scattering of data along the various dimensions within the clusters. The proposed inter-cluster separation measure indicates the isolation of, or distance between, the fuzzy clusters. The proposed inter-cluster overlap measure determines the degree of overlap between the fuzzy clusters. The best fuzzy partition identified by the VIDSO index is expected to have a low value of the dispersion-based intra-cluster compactness measure, a high degree of inter-cluster separation, and a low degree of inter-cluster overlap. The efficacy of the VIDSO index is evaluated on six benchmark data sets and compared with a number of known validity indices. The experimental results and the comparative study demonstrate that the proposed index is highly effective and reliable in estimating the optimal value of C and an optimal fuzzy partition for each data set, because it is insensitive to changes in the value of the fuzzification parameter, denoted by m. In contrast, the other indices [2], [3], [6], [7] fail to identify the optimal value of C because they are sensitive to changes in m.
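The general workflow the abstract refers to (run FCM for several candidate values of C, score each partition with a validity index, keep the best C) can be sketched as follows. This is only an illustrative sketch: the abstract does not give the VIDSO formula, so the well-known Xie-Beni index (compactness divided by minimum center separation, lower is better) is swapped in as a stand-in. The function names `fcm`, `xie_beni`, and `best_c` are hypothetical and are not taken from the paper.

```python
import numpy as np

def fcm(X, c, m=2.0, n_iter=300, tol=1e-6, seed=0):
    """Plain Fuzzy C-Means. Returns (centers, U), with U of shape (c, n)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)  # memberships of each point sum to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Squared Euclidean distances, shape (c, n); clipped to avoid 0 division
        d2 = ((X[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)
        d2 = np.fmax(d2, 1e-12)
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def xie_beni(X, centers, U):
    """Xie-Beni index (NOT VIDSO): compactness / min squared center distance."""
    d2 = ((X[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)
    compactness = (U ** 2 * d2).sum() / X.shape[0]
    diff = centers[:, None, :] - centers[None, :, :]
    sep = (diff ** 2).sum(axis=2)
    np.fill_diagonal(sep, np.inf)  # ignore distance of a center to itself
    return compactness / sep.min()

def best_c(X, c_values, m=2.0):
    """Score every candidate C and return the one with the lowest index."""
    scores = {c: xie_beni(X, *fcm(X, c, m=m)) for c in c_values}
    return min(scores, key=scores.get), scores
```

Calling `best_c(X, range(2, 7), m=m)` for several values of m gives a simple way to check how sensitive a given index is to the fuzzification parameter, which is the property the abstract claims for VIDSO.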
Keywords :
fuzzy set theory; pattern clustering; FCM; cluster number evaluation; enhanced cluster validity index; fuzzification parameter; fuzzy c-means algorithm; fuzzy cluster distance; fuzzy cluster isolation; hyper-ellipsoid shape clusters; hyper-spherical shape clusters; inter-cluster overlap; inter-cluster separation; intra-cluster compactness; optimal fuzzy partition; relative data dispersion; relative data scattering; relative variability concept; Algorithm design and analysis; Clustering algorithms; Dispersion; Fuzzy logic; Indexes; Partitioning algorithms; Shape;
Conference_Title :
Fuzzy Systems (FUZZ-IEEE), 2014 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-2073-0
DOI :
10.1109/FUZZ-IEEE.2014.6891591