DocumentCode
3697052
Title
A Resource Supply-Demand based Approach for Automatic MapReduce Job Optimization
Author
Jinjun Xiong;Dzung T. Phan;David Kung
Author_Institution
IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
fYear
2015
Firstpage
740
Lastpage
745
Abstract
With the prevalence of big data, MapReduce has emerged as the most widely deployed computing framework for data analysts. This paper addresses MapReduce job performance optimization, targeting system latency reduction. We design a systematic method to optimize the MapReduce job execution process by maximizing the utilization of computing resources. Through careful analysis of the mechanism behind Hadoop, the map-shuffle-reduce workflow is formalized based on resource supply-demand relations. Efficient and effective algorithms are developed to address the optimization using mixed integer nonlinear programming. Experiments on a ten-node cluster demonstrate that the proposed model achieves consistently improved performance and significantly outperforms the system with default parameter settings.
Keywords
"Tuning","Optimization","Computational modeling","Containers","Big data","Delays","Programming"
Publisher
IEEE
Conference_Titel
High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and Systems (ICESS), 2015 IEEE 17th International Conference on
Type
conf
DOI
10.1109/HPCC-CSS-ICESS.2015.130
Filename
7336246