• DocumentCode
    1812776
  • Title
    An analytical performance model of MapReduce
  • Author
    Yang, Xiao; Sun, Jianling
  • Author_Institution
    Dept. of Comput. Sci. & Technol., Zhejiang Univ., Hangzhou, China
  • fYear
    2011
  • fDate
    15-17 Sept. 2011
  • Firstpage
    306
  • Lastpage
    310
  • Abstract
    MapReduce is a distributed computing framework, and its application in distributed systems is a rapidly emerging field. Although the framework can leverage clusters to improve computing performance, tuning it remains challenging. Most current work on MapReduce performance is based on system monitoring and simulation and lacks an analytical performance model. In this paper, we propose a simple and general MapReduce performance model for better understanding the impact of each component on overall program performance, and verify it on a small cluster. The results indicate that our model can predict the performance of a MapReduce system and its relation to the configuration. According to our model, performance can be improved significantly by adjusting the Map split granularity and the number of reducers, without modifying the framework. The model also points out potential bottlenecks of the framework and directions for future improvement.
  • Keywords
    distributed processing; computing performance; distributed computing framework; general MapReduce performance model; map split granularity; system monitoring; system simulation; Analytical models; Companies; Computational modeling; Data models; Distributed databases; Pipelines; Predictive models; MapReduce; distributed computing; performance model
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cloud Computing and Intelligence Systems (CCIS), 2011 IEEE International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-61284-203-5
  • Type
    conf
  • DOI
    10.1109/CCIS.2011.6045080
  • Filename
    6045080