  • DocumentCode
    870195
  • Title
    Performance evaluation of some clustering algorithms and validity indices
  • Author
    Maulik, Ujjwal; Bandyopadhyay, Sanghamitra

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Texas Univ., Arlington, TX, USA
  • Volume
    24
  • Issue
    12
  • fYear
    2002
  • fDate
    12/1/2002 12:00:00 AM
  • Firstpage
    1650
  • Lastpage
    1654
  • Abstract
    In this article, we evaluate the performance of three clustering algorithms, hard K-Means, single linkage, and a simulated annealing (SA) based technique, in conjunction with four cluster validity indices, namely the Davies-Bouldin index, Dunn's index, the Calinski-Harabasz index, and a recently developed index I. Based on a relation between the index I and Dunn's index, a lower bound on the value of the former is theoretically estimated in order to obtain a unique hard K-partition when the data set has distinct substructures. The effectiveness of the different validity indices and clustering methods in automatically evolving the appropriate number of clusters is demonstrated experimentally for both artificial and real-life data sets, with the number of clusters varying from two to ten. Once the appropriate number of clusters is determined, the SA-based clustering technique is used to partition the data into that number of clusters.
  • Keywords
    pattern classification; pattern clustering; simulated annealing; software performance evaluation; unsupervised learning; Calinski-Harabasz index; Davies-Bouldin index; Dunn index; cluster validity indices; clustering; clustering algorithms; hard K-Means; partition matrix; simulated annealing; single linkage; unsupervised classification; validity index; Clustering algorithms; Clustering methods; Couplings; Estimation theory; Euclidean distance; Partitioning algorithms; Simulated annealing; Temperature; Virtual manufacturing;
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/TPAMI.2002.1114856
  • Filename
    1114856
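
As an illustration of how two of the validity indices named in the abstract score a hard partition, the following is a minimal sketch, not the authors' implementation. It assumes Euclidean distance, the common form of Dunn's index (smallest distance between points of different clusters divided by the largest cluster diameter), and the common form of the Davies-Bouldin index (cluster scatter taken as the mean distance to the centroid). The function names and the toy data are hypothetical and do not come from the paper.

```python
import math
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length point tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def dunn_index(clusters):
    """Dunn's index (assumed variant): min inter-cluster point distance
    divided by max intra-cluster diameter. Larger is better."""
    min_separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    max_diameter = max(
        max((math.dist(p, q) for p, q in combinations(c, 2)), default=0.0)
        for c in clusters
    )
    return min_separation / max_diameter


def davies_bouldin_index(clusters):
    """Davies-Bouldin index (assumed variant): average over clusters of the
    worst (scatter_i + scatter_j) / centroid distance. Smaller is better."""
    cents = [centroid(c) for c in clusters]
    scatter = [
        sum(math.dist(p, z) for p in c) / len(c)
        for c, z in zip(clusters, cents)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical toy data: two well-separated unit squares.
square_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
square_b = [(10, 10), (10, 11), (11, 10), (11, 11)]

# A partition that matches the structure, and one that mixes the squares.
good = [square_a, square_b]
bad = [
    [(0, 0), (0, 1), (10, 10), (10, 11)],
    [(1, 0), (1, 1), (11, 10), (11, 11)],
]

for name, part in (("good", good), ("bad", bad)):
    print(name, dunn_index(part), davies_bouldin_index(part))
```

In this sketch the partition that matches the two squares receives a higher Dunn value and a lower Davies-Bouldin value than the mixed one. Comparing such scores across candidate partitions, or across candidate numbers of clusters, is the kind of selection the abstract describes.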