DocumentCode
3454917
Title
A Hierarchical Clustering Algorithm Based on K-Means with Constraints
Author
Hang, GuoYan ; Zhang, DongMei ; Ren, Jiadong ; Hu, Changzhen
Author_Institution
Coll. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China
fYear
2009
fDate
7-9 Dec. 2009
Firstpage
1479
Lastpage
1482
Abstract
Hierarchical clustering is one of the most important tasks in data mining. However, existing hierarchical clustering algorithms are time-consuming and yield low clustering quality because they ignore constraints. In this paper, a Hierarchical Clustering Algorithm based on K-means with Constraints (HCAKC) is proposed. To improve clustering efficiency, HCAKC defines an Improved Silhouette to determine the optimal number of clusters. To improve hierarchical clustering quality, existing pairwise must-link and cannot-link constraints are adopted to update the cohesion matrix between clusters. A penalty factor is introduced to modify the similarity metric to address constraint violations. Experimental results show that HCAKC has lower computational complexity and better clustering quality than the existing algorithm CSM.
Keywords
computational complexity; constraint handling; data mining; pattern clustering; HCAKC; clustering quality; cohesion matrix; computational complexity; constraints; data mining; hierarchical clustering algorithm; improved Silhouette; k-means; penalty factor; similarity metric; Clustering algorithms; Computational complexity; Computer science; Data analysis; Data engineering; Data mining; Educational institutions; Information science; Iterative algorithms; Partitioning algorithms;
fLanguage
English
Publisher
IEEE
Conference_Titel
Innovative Computing, Information and Control (ICICIC), 2009 Fourth International Conference on
Conference_Location
Kaohsiung
Print_ISBN
978-1-4244-5543-0
Type
conf
DOI
10.1109/ICICIC.2009.18
Filename
5412270