• DocumentCode
    3580564
  • Title
    A Time Based Analysis of Data Processing on Hadoop Cluster
  • Author
    Pal, Amrit ; Agrawal, Sanjay
  • Author_Institution
    Dept. of Comput. Eng. & Applic., Nat. Inst. of Tech. Teachers' Training & Res. Bhopal, Bhopal, India
  • fYear
    2014
  • Firstpage
    608
  • Lastpage
    612
  • Abstract
    Data becomes Big Data when its volume grows beyond what a traditional database management system can manage. Managing data at this scale is difficult. Hadoop is a technological answer to Big Data. Data storage and information retrieval are handled by the Hadoop Distributed File System and the MapReduce programming model. MapReduce provides effective benchmarks for retrieving information from Big Data. In this paper we present our experimental work on a Hadoop cluster. We analyze the time the cluster requires to process data as the number of nodes in the cluster increases. We start with a single node and add one node at a time. We analyze three types of time: real time, user time, and system time.
  • Keywords
    Big Data; information retrieval; storage management; Hadoop cluster; Hadoop distributed file system; MapReduce programming model; data processing; data storage; real time; system time; time based analysis; user time; Distributed databases; File systems; Google; Real-time systems; Sorting; Data Node; Job Tracker; MapReduce; Name Node; Task Tracker
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Communication Networks (CICN), 2014 International Conference on
  • Print_ISBN
    978-1-4799-6928-9
  • Type
    conf
  • DOI
    10.1109/CICN.2014.136
  • Filename
    7065556