• DocumentCode
    2303377
  • Title

    A Simple and Fast Algorithm for Global K-means Clustering

  • Author

    Xie, Juanying ; Jiang, Shuai

  • Author_Institution
    Sch. of Comput. Sci., Shaanxi Normal Univ., Xi´´an, China
  • Volume
    2
  • fYear
    2010
  • fDate
    6-7 March 2010
  • Firstpage
    36
  • Lastpage
    40
  • Abstract
    K-means clustering is a popular clustering algorithm based on the partition of data. However, there are some shortcomings of it, such as its requiring a user to give out the number of clusters at first, and its sensitiveness to initial conditions, and its easily getting to the trap of a local solution et cetera. The global K-means algorithm proposed by Likas et al is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure consisting of N (with N being the size of the data set) runs of the K-means algorithm from suitable initial positions. It avoids the depending on any initial conditions or parameters, and considerably outperforms the K-means algorithms, but it has a heavy computational load. In this paper, we propose a new version of the global K-means algorithm. The outstanding feature of our algorithm is its superiority in execution time. It takes less run time than that of the available global K-means algorithms. This great advantage is due to that we improved the way of creating the next cluster center in the global K-means algorithm. We defined a novel function to select the optimal candidate center for the next cluster enlightened by the idea of K-medoids clustering algorithm suggested by Park and Jun. Experiments on some well-known data sets from UCI and on a synthetic data set show that our new algorithm can significantly reduce the computational time without affecting the performance of the global K-means algorithm. The further experiments demonstrate that our improved algorithm outperforms the global K-means algorithm greatly.
  • Keywords
    pattern clustering; fast algorithm; global K-means clustering; global search procedure; simple algorithm; Clustering algorithms; Computer science; Computer science education; Data mining; Educational technology; Machine learning; Machine learning algorithms; Optimization methods; Partitioning algorithms; Pattern recognition; K-means clustering; clustering; data mining; global K-means clustering; machine learning; non-smooth optimization; pattern recognition;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Education Technology and Computer Science (ETCS), 2010 Second International Workshop on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4244-6388-6
  • Electronic_ISBN
    978-1-4244-6389-3
  • Type

    conf

  • DOI
    10.1109/ETCS.2010.347
  • Filename
    5460041