• DocumentCode
    3454917
  • Title

    A Hierarchical Clustering Algorithm Based on K-Means with Constraints

  • Author

    Hang, GuoYan ; Zhang, DongMei ; Ren, Jiadong ; Hu, Changzhen

  • Author_Institution
    Coll. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China
  • fYear
    2009
  • fDate
    7-9 Dec. 2009
  • Firstpage
    1479
  • Lastpage
    1482
  • Abstract
    Hierarchical clustering is one of the most important tasks in data mining. However, existing hierarchical clustering algorithms are time-consuming and yield low clustering quality because they ignore constraints. This paper proposes a Hierarchical Clustering Algorithm based on K-means with Constraints (HCAKC). To improve clustering efficiency, HCAKC defines an Improved Silhouette to determine the optimal number of clusters. To improve hierarchical clustering quality, existing pairwise must-link and cannot-link constraints are used to update the cohesion matrix between clusters, and a penalty factor is introduced to modify the similarity metric and address constraint violations. Experimental results show that HCAKC has lower computational complexity and better clustering quality than the existing algorithm CSM.
  • Keywords
    computational complexity; constraint handling; data mining; pattern clustering; HCAKC; clustering quality; cohesion matrix; computational complexity; constraints; data mining; hierarchical clustering algorithm; improved Silhouette; k-means; penalty factor; similarity metric; Clustering algorithms; Computational complexity; Computer science; Data analysis; Data engineering; Data mining; Educational institutions; Information science; Iterative algorithms; Partitioning algorithms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Innovative Computing, Information and Control (ICICIC), 2009 Fourth International Conference on
  • Conference_Location
    Kaohsiung
  • Print_ISBN
    978-1-4244-5543-0
  • Type

    conf

  • DOI
    10.1109/ICICIC.2009.18
  • Filename
    5412270