DocumentCode
1967944
Title
Automatic Data Clustering Analysis of Arbitrary Shape with K-Means and Enhanced Ant-Based Template Mechanism
Author
Zhang, Wei ; Yang, Hen-I ; Jiang, Hsin-yi ; Chang, Carl K.
fYear
2012
fDate
16-20 July 2012
Firstpage
452
Lastpage
460
Abstract
With the advancement of miniature sensors, wireless networking and context awareness, the importance of data-intensive computing is on the rise, with practical applications such as web categorization and data mining. One of the critical challenges in data-intensive computing is data clustering, since an effective clustering algorithm enables researchers and automated systems to analyze and organize massive amounts of data much more efficiently. Many data clustering algorithms already exist, but most require a priori knowledge of the number of classes to guide the clustering process. We propose Auto_Ant_TMs_Shape, a two-phase algorithm for automatically forming an optimal number of clusters with arbitrary shapes. The first phase uses a hybrid approach of K-means and an enhanced Ant-based template mechanism to generate small seed clusters with high purity in each cluster. In the second phase, the small clusters are iteratively merged by a merging algorithm to obtain the final clusters. We apply Auto_Ant_TMs_Shape to 8 widely used datasets and compare the clustering results with two approaches, one based on a density-based algorithm (DBSCAN) and one based on Particle Swarm Optimization (PSO). The results show that Auto_Ant_TMs_Shape is effective, achieving good clustering results with a near-optimal number of clusters without knowing the number of classes in advance.
Keywords
data analysis; data mining; particle swarm optimisation; pattern clustering; Auto_Ant_TMs_Shape; DBSCAN; K-means; PSO; Web categorization; arbitrary shapes; automated systems; automatic data clustering analysis; context awareness; data organization; data-intensive computing; density-based algorithm; enhanced ant-based template mechanism; merging algorithm; two-phase algorithm; wireless networking; Algorithm design and analysis; Clustering algorithms; Convergence; Density measurement; Merging; Shape; Ant-based Template Mechanism; Data Clustering
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Software and Applications Conference (COMPSAC), 2012 IEEE 36th Annual
Conference_Location
Izmir
ISSN
0730-3157
Print_ISBN
978-1-4673-1990-4
Electronic_ISBN
0730-3157
Type
conf
DOI
10.1109/COMPSAC.2012.66
Filename
6340196
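The two-phase idea summarized in the abstract (first form many small seed clusters, then merge them iteratively into final clusters) can be illustrated with a rough sketch. This is not the authors' Auto_Ant_TMs_Shape: the record does not specify the enhanced ant-based template mechanism or the merging rule, so the sketch substitutes plain K-means for phase one and a nearest-point distance threshold for phase two. The names `kmeans`, `merge_seeds`, `two_phase_cluster`, `n_seeds` and `max_gap` are all hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns the non-empty clusters as lists of points.

    Assumption: stands in for the paper's K-means + ant-based template phase.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to its cluster mean; keep the old center if empty.
        centers = [
            tuple(sum(axis) / len(cl) for axis in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return [cl for cl in clusters if cl]


def min_gap(a, b):
    """Smallest distance between any point of cluster a and any point of b."""
    return min(math.dist(p, q) for p in a for q in b)


def merge_seeds(seeds, max_gap):
    """Repeatedly merge any two clusters whose closest points lie within max_gap.

    Assumption: a simple single-linkage rule, not the paper's merging algorithm.
    """
    clusters = [list(s) for s in seeds]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if min_gap(clusters[i], clusters[j]) <= max_gap:
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    return clusters


def two_phase_cluster(points, n_seeds, max_gap):
    """Phase 1: over-segment into seed clusters. Phase 2: merge nearby seeds."""
    seeds = kmeans(points, n_seeds)
    return merge_seeds(seeds, max_gap)
```

Because merging depends only on the gap between neighbouring seeds, the number of final clusters is not fixed in advance and elongated or irregular groups can be reassembled from several small seeds. In this sketch the result still depends on the hypothetical `n_seeds` and `max_gap` parameters, which would need tuning per dataset.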