• DocumentCode
    668145
  • Title

    Zput: A speedy data uploading approach for the Hadoop Distributed File System

  • Author

    Youwei Wang ; Weiping Wang ; Can Ma ; Dan Meng

  • Author_Institution
    Integration Applic. Center, Inst. Of Comput. Technol., Beijing, China
  • fYear
    2013
  • fDate
    23-27 Sept. 2013
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Hadoop Distributed File System (HDFS) is the storage component of the Hadoop framework, which is designed for maintaining and processing huge datasets efficiently among cluster nodes. To be processed by MapReduce, the computation infrastructure of Hadoop, data must first be uploaded from local file systems to HDFS. Unfortunately, when data is of massive scale, the uploading procedure becomes extremely time-consuming, which causes serious delays for urgent tasks. The primary contribution of this paper is the proposal of Zput, a speedy data uploading mechanism that can significantly accelerate uploading by using a metadata mapping approach. After the implementation and its advantages are described, its disadvantages are analyzed and eliminated by an approach named remote block placement. Evaluation results show that this new mechanism can reduce the running time of the uploading process by about 60-90%, and that remote block placement can speed up block distribution by about 30-40%, while maintaining complete compatibility for upper-layer applications.
  • Keywords
    distributed databases; meta data; storage management; HDFS; Hadoop distributed file system; Zput; block distribution; computation infrastructure; dataset maintenance; dataset processing; metadata mapping approach; remote block placement; speedy data uploading approach; storage component; upper-layer applications; Cryptography; IP networks; Reliability; Switches; Block Replication and Placement; Distributed File System; Metadata Manipulation
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Cluster Computing (CLUSTER), 2013 IEEE International Conference on
  • Conference_Location
    Indianapolis, IN
  • Type

    conf

  • DOI
    10.1109/CLUSTER.2013.6702648
  • Filename
    6702648