A MapReduce framework to implement enhanced K-means algorithm

Author

Rajashree Shettar; Bhimasen. V. Purohit

Author_Institution

Dept. of Computer Science and Engg., R.V. College of Engineering, Bengaluru, India

fYear

2015

Firstpage

361

Lastpage

363

Abstract

Data clustering is an important part of big data analytics. Clustering categorizes data, which in turn helps reveal hidden patterns. K-means is a clustering algorithm well known for its simple computation and for its ability to be executed in parallel. Big data analytics requires distributed computing, which can be achieved with the MapReduce technique. In this paper, an enhanced K-means algorithm is implemented using the MapReduce technique that comes with the Hadoop platform. The enhanced K-means algorithm is more efficient than traditional K-means because it selects the initial cluster centroids by averaging the data points, rather than choosing them at random as traditional K-means does. The enhanced K-means algorithm also achieves better accuracy in cluster formation than traditional K-means.
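
The abstract does not spell out the averaging rule or the Hadoop job layout, so the following is only a minimal in-memory sketch under stated assumptions, not the authors' implementation. It assumes that the initial centroids are obtained by sorting the points by distance from the origin, splitting them into k contiguous chunks, and averaging each chunk; this is one common reading of "averaging the data points". The map and reduce steps are plain Python functions that imitate the shape of a MapReduce K-means iteration; all function names (`initial_centroids`, `map_phase`, `reduce_phase`, `kmeans`) are hypothetical.

```python
from collections import defaultdict
from math import dist


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def initial_centroids(points, k):
    """Assumed initialization (not confirmed by the abstract): sort points by
    distance from the origin, cut into k contiguous chunks, average each chunk."""
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    origin = (0.0,) * len(points[0])
    ordered = sorted(points, key=lambda p: dist(p, origin))
    size = len(ordered) // k
    centroids = []
    for i in range(k):
        # The last chunk absorbs any remainder.
        end = len(ordered) if i == k - 1 else (i + 1) * size
        centroids.append(mean_point(ordered[i * size:end]))
    return centroids


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce: average the points grouped under each centroid index."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    # Keep the previous centroid if a cluster received no points.
    return [
        mean_point(groups[i]) if groups[i] else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans(points, k, max_iter=100):
    """Repeat map and reduce until the centroids stop changing."""
    centroids = initial_centroids(points, k)
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids
```

Calling `kmeans(points, k)` on a list of equal-length numeric tuples returns k centroids. Because the initialization is deterministic, repeated runs on the same data give the same clusters, which random seeding does not guarantee. On a real Hadoop cluster the map and reduce functions would run as distributed tasks over input splits, with a driver resubmitting the job each iteration.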

Keywords

"Clustering algorithms","Algorithm design and analysis","Data mining","Big data","Computer science","Arrays"

Publisher

IEEE

Conference_Title

Applied and Theoretical Computing and Communication Technology (iCATccT), 2015 International Conference on

Type

conf

DOI

10.1109/ICATCCT.2015.7456910

Filename

7456910

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3770043