A Comparison Study of Cluster Validity Indices Using a Nonhierarchical Clustering Algorithm

Author

Shim, Yosung ; Chung, Jiwon ; Choi, In-Chan

Author_Institution

Dept. of Ind. Syst. & Inf. Eng., Korea Univ., Seoul

Volume

1

fYear

2005

fDate

28-30 Nov. 2005

Firstpage

199

Lastpage

204

Abstract

Cluster analysis is widely used in the initial stages of data analysis and data reduction. The K-means algorithm, a nonhierarchical clustering algorithm, has regained popularity among researchers in data mining and knowledge discovery, partly because of its low time complexity. The algorithm requires the number of clusters as an input parameter. When the parameter value is not known a priori, a researcher often has to use a cluster validity index to search for a suitable value. In this study, we use computational experiments to examine the performance of cluster validity indices with the K-means algorithm. Our analysis parallels the study by Milligan and Cooper, which examined cluster validity indices with hierarchical clustering algorithms. We present observations and conclusions drawn from the simulation study.
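
The search described in the abstract (run K-means for several candidate cluster counts and keep the count that a validity index scores best) can be sketched as follows. This is a minimal illustration, not the authors' experimental setup: the Calinski-Harabasz index is picked here as one common validity index, and the synthetic data, the farthest-point initialization, and the names `kmeans` and `calinski_harabasz` are all assumptions made for the example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def kmeans(points, k, iters=100):
    """Plain K-means. Farthest-point initialization is an assumption
    made here so that the example is deterministic."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = mean(members)
    return labels, centers

def calinski_harabasz(points, labels, centers):
    """Ratio of between-cluster to within-cluster dispersion,
    each divided by its degrees of freedom. Larger is better."""
    n, k = len(points), len(centers)
    overall = mean(points)
    within = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    between = sum(labels.count(j) * sq_dist(centers[j], overall) for j in range(k))
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical data: three well-separated 2-D Gaussian blobs.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blob_centers for _ in range(30)]

# Score each candidate number of clusters and keep the best one.
scores = {}
for k in range(2, 7):
    labels, centers = kmeans(points, k)
    scores[k] = calinski_harabasz(points, labels, centers)
best_k = max(scores, key=scores.get)
print(best_k)
```

On blobs this well separated, the index would be expected to peak at the true number of clusters. How reliably different indices recover that number on harder data is the question the study examines.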

Keywords

data analysis; data mining; data reduction; pattern clustering; unsupervised learning; K-means algorithm; cluster analysis; cluster validity index; knowledge discovery; nonhierarchical clustering algorithm; Algorithm design and analysis; Clustering algorithms; Computational modeling; Data engineering; Educational institutions; Information analysis; Partitioning algorithms; Systems engineering and theory

fLanguage

English

Publisher

IEEE

Conference_Titel

Computational Intelligence for Modelling, Control and Automation, 2005 and International Conference on Intelligent Agents, Web Technologies and Internet Commerce, International Conference on

Conference_Location

Vienna

Print_ISBN

0-7695-2504-0

Type

conf

DOI

10.1109/CIMCA.2005.1631265

Filename

1631265