DocumentCode :
3717290
Title :
High quality clustering of big data and solving empty-clustering problem with an evolutionary hybrid algorithm
Author :
Jeyhun Karimov;Murat Ozbayoglu
Author_Institution :
Computer Engineering Department, TOBB University of Economics and Technology, Ankara, Turkey
Year :
2015
Firstpage :
1473
Lastpage :
1478
Abstract :
Achieving high-quality clustering is one of the best-known problems in data mining. k-means is by far the most commonly used clustering algorithm. It converges fairly quickly, but it is not guaranteed to reach a good solution. The clustering quality depends heavily on the selection of the initial centroids. Moreover, as the number of clusters increases, k-means starts to suffer from "empty clustering". The motivation of this study is two-fold: we aim not only to improve k-means clustering quality, but also to avoid being affected by the empty-cluster issue. To this end, we developed a hybrid model, H(EC)2S (Hybrid Evolutionary Clustering with Empty Clustering Solution). First, it selects representative points to eliminate the empty-clustering problem. Then, the hybrid algorithm uses only these points during centroid selection. The proposed model combines a Fireworks and Cuckoo-search based evolutionary algorithm with some centroid-calculation heuristics. The model is implemented using Hadoop MapReduce to achieve scalability when faced with a Big Data clustering problem. The advantages of the developed model are particularly attractive as the amount of data, the dimensionality, and the number of clusters increase. The results indicate that the proposed model achieves a considerable improvement in clustering quality.
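Illustrative note (not part of the published abstract): the empty-cluster issue mentioned above can be reproduced with a single k-means assignment step. The sketch below is not the H(EC)2S method; it only shows, on a made-up toy dataset, that a centroid placed far from all data receives no points, while centroids chosen from distinct data points each receive at least their own point in that first step. The function names `assign` and `count_empty_clusters` and all coordinates are hypothetical.

```python
import math

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]

def count_empty_clusters(points, centroids):
    """Count centroids that receive no points in one assignment step."""
    used = set(assign(points, centroids))
    return len(centroids) - len(used)

# Toy data (assumed for illustration): two tight groups of three points.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]

# Poor initialization: the third centroid lies far from every point.
bad_init = [(0.0, 0.0), (10.0, 10.0), (100.0, 100.0)]

# Initialization restricted to distinct data points.
good_init = [points[0], points[3], points[2]]

print(count_empty_clusters(points, bad_init))
print(count_empty_clusters(points, good_init))
```

Restricting candidate centroids to points drawn from the data is only loosely analogous to the "representative points" idea in the abstract; how the paper actually selects those points is not described in this record.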
Keywords :
"Clustering algorithms","Mathematical model","Big data","Explosions","Sparks","Evolutionary computation","Arrays"
Publisher :
IEEE
Conference_Title :
2015 IEEE International Conference on Big Data (Big Data)
Type :
conf
DOI :
10.1109/BigData.2015.7363909
Filename :
7363909