DocumentCode
3580564
Title
A Time Based Analysis of Data Processing on Hadoop Cluster
Author
Pal, Amrit ; Agrawal, Sanjay
Author_Institution
Dept. of Comput. Eng. & Applic., Nat. Inst. of Tech. Teachers' Training & Res. Bhopal, Bhopal, India
fYear
2014
Firstpage
608
Lastpage
612
Abstract
When data grows so large that a traditional database management system can no longer manage it, it is called Big Data. Managing data at this scale is difficult. Hadoop is a technological answer to Big Data. Data storage and information retrieval are handled by the Hadoop Distributed File System and the MapReduce programming model. MapReduce provides effective benchmarks for retrieving information from Big Data. In this paper we present experimental work carried out on a Hadoop cluster. We analyze the time the cluster requires to process data as the number of nodes in the cluster increases, starting with a single node and adding one node at a time. Three types of time are analyzed: real time, user time, and system time.
Keywords
Big Data; information retrieval; storage management; Hadoop cluster; Hadoop distributed file system; MapReduce programming model; data processing; data storage; real time; system time; time based analysis; user time; Distributed databases; File systems; Google; Real-time systems; Sorting; Data Node; Job Tracker; MapReduce; Name Node; Task Tracker;
fLanguage
English
Publisher
ieee
Conference_Titel
Computational Intelligence and Communication Networks (CICN), 2014 International Conference on
Print_ISBN
978-1-4799-6928-9
Type
conf
DOI
10.1109/CICN.2014.136
Filename
7065556