Title :
A Hierarchical Clustering Algorithm Based on Grid Partition
Author :
Zhao, Hongbin ; Han, Qilong ; Pan, Haiwei
Author_Institution :
Coll. of Autom., Harbin Eng. Univ., Harbin, China
Abstract :
In spatial data mining, the k-means algorithm is probably the most widely applied clustering method. However, a major drawback of k-means is that the parameter k that represents the natural clusters is difficult to determine, and the algorithm is suitable only for convex spherical clusters. This paper presents an efficient clustering algorithm that combines the hierarchical approach with grid partition. The hierarchical approach finds the genuine clusters by repeatedly merging the blocks of the grid partition. The Hilbert curve is a continuous path that passes through every point in a space and provides a mapping between the coordinates of the points and the one-dimensional sequence numbers of the points on the curve. The Hilbert curve is used to preserve locality: points that are close in space and represent similar data should be stored close together in the linear order. Simulation shows that the algorithm can achieve shorter execution times than k-means on large databases. Moreover, the algorithm can handle clusters of arbitrary shape, which k-means cannot discover.
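The mapping from grid coordinates to a one-dimensional sequence number mentioned in the abstract can be illustrated with the standard iterative Hilbert-curve index computation. This is only a minimal sketch and not necessarily the construction used in the paper: the function name `hilbert_index`, the restriction to two dimensions, and the assumption that the grid side `n` is a power of two are all illustrative choices.

```python
def hilbert_index(n, x, y):
    """Return the position of grid cell (x, y) along a 2-D Hilbert curve.

    Assumptions (not from the paper): the grid is n x n, n is a power
    of two, and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        # Which quadrant of the current sub-square holds the cell?
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        # Each quadrant contributes s*s cells; (3*rx) ^ ry is its visiting order.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip so the sub-square has the canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


if __name__ == "__main__":
    # Order the cells of a 4 x 4 grid by their Hilbert index.
    n = 4
    cells = sorted(
        ((x, y) for x in range(n) for y in range(n)),
        key=lambda c: hilbert_index(n, c[0], c[1]),
    )
    print(cells)
```

Sorting grid cells by this index yields a linear order in which consecutive cells are always neighbours in the grid, which is the locality property the abstract relies on when storing nearby points close together.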
Keywords :
data mining; pattern clustering; very large databases; Hilbert curve; concave spherical clusters; genuine clusters; grid partition; hierarchical clustering algorithm; k-means algorithm; large databases; one-dimensional sequence numbers; spatial data mining; clustering algorithm; hierarchical clustering
Conference_Title :
Multimedia Communications (Mediacom), 2010 International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-0-7695-4136-5
DOI :
10.1109/MEDIACOM.2010.46