DocumentCode :
3140755
Title :
Optimal Algorithms for Cross-Rack Communication Optimization in MapReduce Framework
Author :
Ho, Li-Yung ; Wu, Jan-Jan ; Liu, Pangfeng
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ., Taipei, Taiwan
fYear :
2011
fDate :
4-9 July 2011
Firstpage :
420
Lastpage :
427
Abstract :
MapReduce is a widely used data-parallel programming model for large-scale data analysis. The framework has been shown to scale to thousands of computing nodes and to be reliable on commodity clusters. However, research has shown that there is room for performance improvement in the MapReduce framework. One of the main performance bottlenecks is caused by the all-to-all communication between mappers and reducers, which may saturate the top-of-rack switch and inflate job execution time. Reducing cross-rack communication will improve job performance. In the current MapReduce implementation, task assignment is based on the pull model, in which cross-rack traffic is difficult to control. In contrast, the MapReduce framework allows more flexibility in assigning reducers to the computing nodes. In this paper, we investigate the reducer placement problem (RPP), which considers the placement of reducers to minimize cross-rack traffic. We devise two optimal algorithms to solve RPP and implement the algorithms in the Hadoop system. We also propose an analytical solution for this problem. Our experimental results with a set of MapReduce applications show that our optimization achieves a 9% to 32% performance improvement compared with unoptimized Hadoop.
Keywords :
cloud computing; data analysis; parallel programming; public domain software; Hadoop system; MapReduce framework; commodity clusters; cross-rack communication optimization; cross-rack traffic minimization; data analysis; data-parallel programming model; job execution time inflation; optimal algorithms; pull-model; reducer placement problem; task assignment; top-of-rack switch; Data models; Equations; Greedy algorithms; Manganese; Optimization; Silicon; Switches; MapReduce optimization; cloud computing; cross-rack communication; network load balancing; optimal algorithm;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cloud Computing (CLOUD), 2011 IEEE International Conference on
Conference_Location :
Washington, DC
ISSN :
2159-6182
Print_ISBN :
978-1-4577-0836-7
Electronic_ISBN :
2159-6182
Type :
conf
DOI :
10.1109/CLOUD.2011.17
Filename :
6008738