Title :
Characterization and Optimization of Memory-Resident MapReduce on HPC Systems
Author :
Yandong Wang; Robin Goldstone; Weikuan Yu; Teng Wang
Abstract :
MapReduce is a widely accepted framework for addressing big data challenges. Recently, it has also gained broad attention from scientists at the U.S. leadership computing facilities as a promising solution for processing gigantic simulation results. However, conventional high-end computing systems are built on the compute-centric paradigm, whereas big data analytics applications prefer a data-centric paradigm such as MapReduce. This work characterizes the performance impact of key differences between the compute- and data-centric paradigms and then provides optimizations to enable a dual-purpose HPC system that can efficiently support both conventional HPC applications and new data analytics applications. Using Spark, a state-of-the-art MapReduce implementation, and the Hyperion system at Lawrence Livermore National Laboratory, we have examined the impact of storage architectures, data locality, and task scheduling on memory-resident MapReduce jobs. Based on our characterization and findings, we have introduced two optimization techniques, namely Enhanced Load Balancer and Congestion-Aware Task Dispatching, to improve the performance of Spark applications.
Keywords :
data analysis; optimisation; parallel processing; resource allocation; Hyperion system; Lawrence Livermore National Laboratory; Spark applications; compute-centric paradigms; congestion-aware task dispatching; data analytics applications; data locality; data-centric paradigm; dual-purpose HPC system; enhanced load balancer; high-end computing systems; memory-resident MapReduce jobs; optimization techniques; performance behaviors; storage architectures; task scheduling; Benchmark testing; Big data; Computer architecture; Optimization; Processor scheduling; Servers; Sparks
Conference_Title :
Parallel and Distributed Processing Symposium, 2014 IEEE 28th International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-3799-8
DOI :
10.1109/IPDPS.2014.87