Regional Information Center for Science and Technology - An optimal distributed K-Means clustering algorithm based on CloudStack

DocumentCode :

3660556

Title :

An optimal distributed K-Means clustering algorithm based on CloudStack

Author :

Yingchi Mao;Ziyang Xu;Xiaofang Li;Ping Ping

Author_Institution :

College of Computer and Information Engineering, Hohai University, Nanjing, Jiangsu Province, China

Year :

2015

Firstpage :

3149

Lastpage :

3156

Abstract :

Clustering algorithms are applied in many fields, especially data mining. As data volumes grow, the computation time of clustering algorithms becomes unaffordable under the traditional computing model. To handle big data, data mining algorithms have moved from their original single-core or single-port form to parallel and distributed processing, and parallel processing has become the most popular way to improve execution performance. This paper establishes a Hadoop distributed cluster based on CloudStack and implements an optimal distributed K-Means clustering algorithm based on MapReduce. The proposed algorithm achieves both good result quality and efficient execution time. The experimental results show that it performs better when dealing with large-scale data sets.
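
The abstract does not describe the algorithm's steps, so the following is only a minimal single-process sketch of how one K-Means iteration is commonly split into a map step and a reduce step. It is not the paper's optimized method and does not use Hadoop or CloudStack; the function names `map_phase`, `reduce_phase`, and `kmeans`, and the parameters `max_iter` and `tol`, are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from math import dist


def map_phase(points, centroids):
    """Map step: emit (index of nearest centroid, point) for every point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce step: average the points that were assigned to each centroid."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    # Assumption: a centroid that received no points keeps its old position.
    new_centroids = list(centroids)
    for idx, members in groups.items():
        new_centroids[idx] = tuple(sum(c) / len(members) for c in zip(*members))
    return new_centroids


def kmeans(points, centroids, max_iter=20, tol=1e-9):
    """Repeat map + reduce until no centroid moves more than tol."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift <= tol:
            break
    return centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(data, [(0, 0), (10, 10)]))
```

In a real MapReduce job the map step would run in parallel over data partitions and the framework would group the emitted pairs by centroid index before the reduce step; here both steps run sequentially in one process to keep the sketch self-contained.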

Keywords :

"Clustering algorithms","Algorithm design and analysis","Computational modeling","Distributed databases","Complexity theory","Virtual machining","Data mining"

Publisher :

IEEE

Conference_Title :

Information and Automation, 2015 IEEE International Conference on

Type :

conf

DOI :

10.1109/ICInfA.2015.7279830

Filename :

7279830

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3660556