DocumentCode :
1978631
Title :
The performance evaluation of k-means by two MapReduce frameworks, Hadoop vs. Twister
Author :
Kang, Yunhee ; Park, Young B.
Author_Institution :
Div. of Inf. & Commun., Baekseok Univ., Cheonan, South Korea
fYear :
2015
fDate :
12-14 Jan. 2015
Firstpage :
405
Lastpage :
406
Abstract :
In data mining, k-means is a method of cluster analysis that assigns each object to the cluster with the nearest mean. It has been used successfully in a range of fields, including market segmentation, computer vision, geostatistics, astronomy, and agriculture. However, k-means-like clustering is not easy to express in the MapReduce model because of its iterative nature, which is highly likely to produce straggling map tasks. This paper presents the results of a performance evaluation of a k-means application running on the Twister and Hadoop frameworks. We report how to design a MapReduce application that organizes the objects of a dataset into k partitions. This approach provides a way to cluster a dataset in parallel using Hadoop, a MapReduce framework.
Keywords :
data handling; data mining; parallel programming; pattern clustering; Hadoop framework; MapReduce framework; Twister framework; cluster analysis; data mining; k-means performance evaluation; k-partitions; nearest mean; Computational modeling; Data mining; Distributed databases; Educational institutions; Performance evaluation; Programming; Runtime; BLAST; Emulab; MapReduce; mr-mpi;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Networking (ICOIN), 2015 International Conference on
Conference_Location :
Cambodia
Type :
conf
DOI :
10.1109/ICOIN.2015.7057927
Filename :
7057927