DocumentCode :
3199407
Title :
High-Performance Design of YARN MapReduce on Modern HPC Clusters with Lustre and RDMA
Author :
Wasi-ur-Rahman, Md ; Lu, Xiaoyi ; Islam, Nusrat Sharmin ; Rajachandrasekar, Raghunath ; Panda, Dhabaleswar K.
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
fYear :
2015
fDate :
25-29 May 2015
Firstpage :
291
Lastpage :
300
Abstract :
The viability and benefits of running MapReduce over modern High Performance Computing (HPC) clusters, with high performance interconnects and parallel file systems, have attracted much attention in recent times because this approach solves data analytics problems with a combination of Big Data and HPC technologies. Most HPC clusters follow the traditional Beowulf architecture with a separate parallel storage system (e.g., Lustre) and either no, or very limited, local storage. Since the MapReduce architecture relies heavily on the availability of local storage media, the Lustre-based global storage system in HPC clusters poses many new opportunities and challenges. In this paper, we propose a novel high-performance design for running YARN MapReduce on such HPC clusters by utilizing Lustre as the storage provider for intermediate data. We identify two different shuffle strategies, RDMA and Lustre Read, for this architecture and provide modules to dynamically detect the best strategy for a given scenario. Our results indicate that, due to the performance characteristics of the underlying Lustre setup, one shuffle strategy may outperform another in different HPC environments, and our dynamic detection mechanism can deliver the best performance based on the performance characteristics obtained at runtime during job execution. Through this design, we can achieve a 44% performance benefit for shuffle-intensive workloads in leadership-class HPC systems. To the best of our knowledge, this is the first attempt to exploit the performance characteristics of alternate shuffle strategies for YARN MapReduce with Lustre and RDMA.
Keywords :
Big Data; data analysis; parallel processing; storage management; Beowulf architecture; Big Data; HPC clusters; Lustre read; Lustre-based global storage system; MapReduce architecture; RDMA; YARN MapReduce; data analytics problems; dynamic detection mechanism; high performance computing clusters; high-performance design; leadership-class HPC systems; local storage media; parallel file systems; parallel storage system; shuffle strategies; shuffle-intensive workloads; Bandwidth; Big data; Computer architecture; Data analysis; File systems; Servers; Yarn; HPC Clusters; Lustre; MapReduce; RDMA; YARN;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International
Conference_Location :
Hyderabad
ISSN :
1530-2075
Type :
conf
DOI :
10.1109/IPDPS.2015.83
Filename :
7161518