• DocumentCode
    188217
  • Title

    Data Clustering with Cluster Size Constraints Using a Modified K-Means Algorithm

  • Author

    Ganganath, Nuwan ; Chi-Tsun Cheng ; Tse, Chi K.

  • Author_Institution
    Dept. of Electron. & Inf. Eng., Hong Kong Polytech. Univ., Kowloon, China
  • fYear
    2014
  • fDate
    13-15 Oct. 2014
  • Firstpage
    158
  • Lastpage
    161
  • Abstract
    Data clustering is a frequently used technique in finance, computer science, and engineering. In most applications, cluster sizes are either constrained to particular values or available as prior knowledge. Unfortunately, traditional clustering methods cannot impose constraints on cluster sizes. In this paper, we propose modifications to the standard k-means algorithm so that it can incorporate a size constraint for each cluster separately. The modified k-means algorithm can be used to obtain clusters of preferred sizes; one potential application is obtaining clusters of equal size. Moreover, the modified algorithm uses prior knowledge of the given data set to selectively initialize the cluster centroids, which helps it escape from local minima. Simulation results on multidimensional data demonstrate that the k-means algorithm with the proposed modifications can fulfill cluster size constraints and lead to more accurate and robust results.
  • Keywords
    pattern clustering; cluster centroids; cluster size constraints; data clustering; data set; modified k-means algorithm; multidimensional data; Algorithm design and analysis; Clustering algorithms; Clustering methods; Data models; Partitioning algorithms; Simulation; Standards; constrained clustering; data mining; k-means; size constraints
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2014 International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4799-6235-8
  • Type

    conf

  • DOI
    10.1109/CyberC.2014.36
  • Filename
    6984299
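
The abstract describes a k-means variant in which each cluster has a prescribed size. The record does not give the paper's actual procedure, so the following is only a minimal sketch of one generic way to enforce such constraints: a greedy assignment step that fills clusters in order of increasing point-to-centroid distance until each reaches its capacity. The function name `size_constrained_kmeans`, its parameters, and the random centroid initialization are all assumptions made for illustration; in particular, the selective initialization from prior knowledge mentioned in the abstract is not reproduced here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def size_constrained_kmeans(points, sizes, max_iter=100, seed=0):
    """Greedy size-constrained k-means (illustrative, not the paper's method).

    points: list of coordinate tuples
    sizes:  required size of each cluster; must sum to len(points)
    """
    if sum(sizes) != len(points):
        raise ValueError("cluster sizes must sum to the number of points")
    rng = random.Random(seed)
    k = len(sizes)
    # Assumption: plain random initialization from the data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Rank every (point, cluster) pair by squared distance.
        pairs = sorted(
            (sq_dist(p, c), i, j)
            for i, p in enumerate(points)
            for j, c in enumerate(centroids)
        )
        remaining = list(sizes)
        new_labels = [None] * len(points)
        # Assign each point to the closest cluster that still has capacity.
        for _, i, j in pairs:
            if new_labels[i] is None and remaining[j] > 0:
                new_labels[i] = j
                remaining[j] -= 1
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [points[i] for i in range(len(points)) if labels[i] == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Calling `size_constrained_kmeans(points, [3, 3])` on six points returns a label list in which each of the two clusters has exactly three members. The greedy step guarantees the size constraints but not a globally optimal assignment; an exact assignment solver could be swapped in at that step.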