Title :
Automatic Data Clustering Analysis of Arbitrary Shape with K-Means and Enhanced Ant-Based Template Mechanism
Author :
Zhang, Wei ; Yang, Hen-I ; Jiang, Hsin-yi ; Chang, Carl K.
Abstract :
With the advancement of miniature sensors, wireless networking, and context awareness, the importance of data-intensive computing is on the rise, with practical applications such as web categorization and data mining. One of the critical challenges in data-intensive computing is data clustering, as an effective clustering algorithm enables researchers and automated systems to analyze and organize massive amounts of data much more efficiently. Many data clustering algorithms already exist, but most require a priori knowledge of the number of classes to guide the clustering process. We propose Auto_Ant_TMs_Shape, a two-phase algorithm that automatically forms an optimal number of clusters with arbitrary shapes. The first phase uses a hybrid of K-means and an enhanced Ant-based template mechanism to generate small seed clusters, each with high purity. In the second phase, a merging algorithm iteratively merges these small clusters to obtain the final clusters. We apply Auto_Ant_TMs_Shape to 8 widely used datasets and compare its clustering results with two approaches based on a density-based algorithm (DBSCAN) and Particle Swarm Optimization (PSO). The results show that Auto_Ant_TMs_Shape is effective, achieving good clustering results with a near-optimal number of clusters without knowing the number of classes in advance.
Keywords :
data analysis; data mining; particle swarm optimisation; pattern clustering; Auto_Ant_TMs_Shape; DBSCAN; K-means; PSO; Web categorization; arbitrary shapes; automated systems; automatic data clustering analysis; context awareness; data organization; data-intensive computing; density-based algorithm; enhanced ant-based template mechanism; merging algorithm; two-phase algorithm; wireless networking; Algorithm design and analysis; Clustering algorithms; Convergence; Density measurement; Merging; Shape; Ant-based Template Mechanism; Data Clustering
Conference_Title :
Computer Software and Applications Conference (COMPSAC), 2012 IEEE 36th Annual
Conference_Location :
Izmir
Print_ISBN :
978-1-4673-1990-4
Electronic_ISBN :
0730-3157
DOI :
10.1109/COMPSAC.2012.66