DocumentCode :
238120
Title :
Fuzzy K-mean clustering in MapReduce on cloud based hadoop
Author :
Garg, Deepak ; Trivedi, Khushbu
Author_Institution :
Dept. of Comput. Sci. & Eng., Parul Inst. of Eng. & Technol., Limda, India
fYear :
2014
fDate :
8-10 May 2014
Firstpage :
1607
Lastpage :
1610
Abstract :
Clustering is regarded as one of the significant tasks in data mining; it deals primarily with grouping similar data. Clustering large data sets is a point of concern. Hadoop is a software framework for distributed processing of huge amounts of data across clusters of commodity computers using the MapReduce programming model. MapReduce allows parallelization of problems involving large data sets on computing clusters and is therefore an attractive means of clustering large data sets. Mahout, a scalable machine learning library, provides an implementation of Fuzzy K-mean clustering that runs on Hadoop. This paper studies the performance of Fuzzy K-mean clustering in MapReduce on Hadoop across different datasets. Experimental results show the execution time of the approach on a multi-node Hadoop cluster built using Amazon Elastic Compute Cloud (Amazon EC2).
Keywords :
cloud computing; data handling; learning (artificial intelligence); parallel programming; pattern clustering; Amazon EC2; Amazon Elastic Cloud Computing; Mahout; MapReduce programming model; data clustering; distributed processing; fuzzy k-mean clustering; machine learning library; multinode Hadoop cluster; Clustering algorithms; Computers; Conferences; Data mining; Iris; Java; Vectors; Fuzzy K-mean clustering; HDFS; Hadoop; Mahout; MapReduce;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Communication Control and Computing Technologies (ICACCCT), 2014 International Conference on
Conference_Location :
Ramanathapuram
Print_ISBN :
978-1-4799-3913-8
Type :
conf
DOI :
10.1109/ICACCCT.2014.7019379
Filename :
7019379