• DocumentCode
    3724245
  • Title
    A MapReduce Optimization Method on Hadoop Cluster
  • Author
    Xiaodong Wu
  • Author_Institution
    Fujian Provincial Key Lab. of Data Intensive Comput. Key Lab. of Intell. Comput. &
  • fYear
    2015
  • Firstpage
    18
  • Lastpage
    21
  • Abstract
    The MapReduce parallel and distributed computing framework has been widely applied in both academia and industry. A MapReduce application is divided into two steps: Map and Reduce. The input data is divided into splits, which can be processed concurrently, and the number of splits determines the number of map tasks. In this paper, we present a regression-based method to compute the number of Map tasks and Reduce tasks so that the performance of a MapReduce application can be improved. Regression analysis is used to predict the execution time of MapReduce applications. Experimental results show that the proposed optimization method can effectively reduce the execution time of the applications.
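The abstract does not state which regression model the paper uses, so the following is only a minimal sketch of the general idea, under stated assumptions: execution time is modeled as a quadratic function of the task count, the model is fitted by ordinary least squares, and the task count with the lowest predicted time is chosen. The function names (`num_map_tasks`, `fit_quadratic`, `predict`, `best_task_count`) and the quadratic form are hypothetical and not taken from the paper. The split-count helper is also a simplification that ignores per-file boundaries and other details of Hadoop's actual split computation.

```python
def num_map_tasks(input_bytes, split_bytes):
    """Approximate map task count: one task per input split (ceiling division)."""
    return -(-input_bytes // split_bytes)


def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 (assumed model, not the paper's).

    Solves the 3x3 normal equations with Gaussian elimination.
    """
    n = 3
    # Normal equations: A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i)
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Forward elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]

    # Back substitution
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


def predict(coeffs, x):
    """Predicted execution time for task count x under the fitted model."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def best_task_count(coeffs, candidates):
    """Candidate task count with the lowest predicted execution time."""
    return min(candidates, key=lambda x: predict(coeffs, x))


if __name__ == "__main__":
    # Synthetic timings generated from a made-up curve, for illustration only;
    # these are not measurements from the paper.
    task_counts = [5, 10, 20, 40, 80]
    times = [100 - 4 * m + 0.1 * m * m for m in task_counts]
    model = fit_quadratic(task_counts, times)
    print(best_task_count(model, range(1, 101)))
```

In practice the training points would be measured run times of the same job under different task counts, and a separate model could be fitted for the number of Reduce tasks.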
  • Keywords
    "Optimization methods","Hardware","Distributed databases","Programming","Data models","Parallel processing"
  • Publisher
    IEEE
  • Conference_Titel
    Industrial Informatics - Computing Technology, Intelligent Technology, Industrial Information Integration (ICIICII), 2015 International Conference on
  • Type
    conf
  • DOI
    10.1109/ICIICII.2015.92
  • Filename
    7373780