DocumentCode
1812776
Title
An analytical performance model of MapReduce
Author
Yang, Xiao ; Sun, Jianling
Author_Institution
Dept. of Comput. Sci. & Technol., Zhejiang Univ., Hangzhou, China
fYear
2011
fDate
15-17 Sept. 2011
Firstpage
306
Lastpage
310
Abstract
MapReduce is a distributed computing framework whose application in distributed systems is a rapidly emerging field. Although the framework can leverage clusters to improve computing performance, tuning it remains challenging. Most current work on MapReduce performance relies on system monitoring and simulation and lacks an analytical performance model. In this paper, we propose a simple and general MapReduce performance model for better understanding the impact of each component on overall program performance, and we verify it on a small cluster. The results indicate that our model can predict the performance of a MapReduce system and its relation to the configuration. According to our model, performance can be improved significantly by adjusting the Map split granularity and the number of reducers, without modifying the framework. The model also points out potential bottlenecks of the framework and directions for future improvement.
Keywords
distributed processing; computing performance; distributed computing framework; general MapReduce performance model; map split granularity; system monitoring; system simulation; Analytical models; Companies; Computational modeling; Data models; Distributed databases; Pipelines; Predictive models; MapReduce; distributed computing; performance model
fLanguage
English
Publisher
IEEE
Conference_Titel
Cloud Computing and Intelligence Systems (CCIS), 2011 IEEE International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-61284-203-5
Type
conf
DOI
10.1109/CCIS.2011.6045080
Filename
6045080