Title :
An analytical performance model of MapReduce
Author :
Yang, Xiao ; Sun, Jianling
Author_Institution :
Dept. of Comput. Sci. & Technol., Zhejiang Univ., Hangzhou, China
Abstract :
MapReduce is a distributed computing framework, and its application in distributed systems is a rapidly emerging field. Although the framework can leverage clusters to improve computing performance, tuning it remains challenging. Most existing work on MapReduce performance is based on system monitoring and simulation and lacks an analytical performance model. In this paper, we propose a simple and general MapReduce performance model for better understanding the impact of each component on overall program performance, and we verify it on a small cluster. The results indicate that our model can predict the performance of a MapReduce system and its relation to the configuration. According to our model, performance can be improved significantly by adjusting the Map split granularity and the number of reducers, without modifying the framework. The model also points out potential bottlenecks of the framework and directions for future performance improvements.
Keywords :
distributed processing; computing performance; distributed computing framework; general MapReduce performance model; map split granularity; system monitoring; system simulation; Analytical models; Companies; Computational modeling; Data models; Distributed databases; Pipelines; Predictive models; MapReduce; distributed computing; performance model
Conference_Title :
Cloud Computing and Intelligence Systems (CCIS), 2011 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-61284-203-5
DOI :
10.1109/CCIS.2011.6045080