Title :
A Comparison Study of Validity Indices on Swarm-Intelligence-Based Clustering
Author :
Xu, Rui ; Xu, Jie ; Wunsch, Donald C., II
Author_Institution :
Machine Learning Lab., GE Global Res., Niskayuna, NY, USA
Abstract :
Swarm intelligence has emerged as a worthwhile class of clustering methods due to its convenient implementation, parallel capability, ability to avoid local minima, and other advantages. In such applications, clustering validity indices usually operate as fitness functions to evaluate the quality of the obtained clusters. However, because validity indices are usually data dependent and designed to address certain types of data, the choice of index as the fitness function may critically affect cluster quality. Here, we compare the performance of eight well-known and widely used clustering validity indices, namely, the Caliński-Harabasz index, the CS index, the Davies-Bouldin index, the Dunn index with two of its generalized versions, the I index, and the silhouette statistic index, on both synthetic and real data sets in the framework of differential-evolution-particle-swarm-optimization (DEPSO)-based clustering. DEPSO is a hybrid evolutionary algorithm that combines a stochastic optimization approach (differential evolution) with a swarm intelligence method (particle swarm optimization) to further increase search capability and achieve higher flexibility in exploring the problem space. According to the experimental results, the silhouette statistic index stands out on most of the data sets that we examined. Nevertheless, we suggest that users base their conclusions not on a single index but on the results of several indices, in order to obtain reliable clustering structures.
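To illustrate how a validity index can serve as a fitness function, the following is a minimal Python sketch of the standard silhouette statistic applied to a candidate partition. It is not the paper's implementation: the abstract does not specify how DEPSO encodes a particle, so the centroid-based encoding, the function names (assign_to_centroids, silhouette_fitness), and the penalty of -1.0 for a single-cluster partition are all assumptions made for illustration.

```python
import numpy as np


def assign_to_centroids(X, centroids):
    """Label each point with the index of its nearest centroid.

    Assumption: a candidate solution is encoded as a set of centroids.
    """
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1)


def silhouette_fitness(X, labels):
    """Mean silhouette value of a partition, in [-1, 1]; higher is better."""
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    if len(clusters) < 2:
        # Assumption: a single-cluster partition gets the worst score.
        return -1.0

    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton cluster: s(i) = 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = dist[i, own].sum() / (n_own - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        if denom > 0:
            scores[i] = (b - a) / denom
    return float(scores.mean())
```

In an optimizer loop, each particle's centroids would be passed through assign_to_centroids and the resulting labels scored with silhouette_fitness, with the optimizer maximizing that score. Swapping in a different index would change only the scoring function, which is the kind of substitution the study compares.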
Keywords :
evolutionary computation; particle swarm optimisation; pattern clustering; search problems; stochastic processes; CS index; Calinski-Harabasz index; DEPSO; Davies-Bouldin index; Dunn index; I index; cluster quality; clustering validity indices; differential evolution-particle swarm optimization-based clustering; fitness functions; hybrid evolutionary algorithm; index selection; performance comparison; real data sets; search capability; silhouette statistic index; stochastic optimization approach; swarm intelligence-based clustering; synthetic data sets; Clustering algorithms; Context; Encoding; Indexes; Particle swarm optimization; Partitioning algorithms; Vectors; Clustering; differential evolution (DE); particle swarm optimization (PSO); swarm intelligence; validity index;
Journal_Title :
Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
DOI :
10.1109/TSMCB.2012.2188509