• DocumentCode
    1641147
  • Title

    An Approximate Algorithm for k-Means Problem Based on Input Points

  • Author

    Shouqiang, Wang ; Daming, Zhu ; Shiying, Shi

  • Author_Institution
    Shandong Univ., Jinan
  • fYear
    2007
  • Firstpage
    500
  • Lastpage
    504
  • Abstract
    The k-means clustering is one of the most popular schemes for discovering clusters in data. Its aim is to minimize the mean squared distance from each data to its nearest center. A lot of variants of Lloyd´s heuristic have been studied. Unfortunately, many research results haven´t given any approximate ratio. In this paper, an algorithm is presented which can obtain the optimal clustering with the ration of at most 2. The main idea of this algorithm is that k points are selected by means of a very simple, randomized seeding technique and then the local search is implemented to improve the accuracy. Some examples are selected to verify our algorithm and got better results both the speed and accuracy than the former methods. The main algorithmic contribution is that the input points are used as the candidate sets to obtain the optimal clustering with a constant ratio by means of local search technique and one method of selecting initial points.
  • Keywords
    data analysis; mean square error methods; pattern clustering; search problems; Lloyd´s heuristic; algorithmic contribution; approximate algorithm; data clusters; k-means clustering; k-means problem; local search technique; mean squared distance; optimal clustering; randomized seeding technique; Clustering algorithms; Computer science; Data engineering; Centroid; Clustering; k-Means; ¿-separated;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Control Conference, 2007. CCC 2007. Chinese
  • Conference_Location
    Hunan
  • Print_ISBN
    978-7-81124-055-9
  • Electronic_ISBN
    978-7-900719-22-5
  • Type

    conf

  • DOI
    10.1109/CHICC.2006.4346918
  • Filename
    4346918