Title :
DREAMS: Dynamic resource allocation for MapReduce with data skew
Author :
Zhihong Liu ; Qi Zhang ; Zhani, Mohamed Faten ; Boutaba, Raouf ; Yaping Liu ; Zhenghu Gong
Author_Institution :
Coll. of Comput., Nat. Univ. of Defense Technol., Changsha, China
Abstract :
MapReduce has become a popular model for large-scale data processing in recent years. However, existing MapReduce schedulers still suffer from an issue known as partitioning skew, where the output of map tasks is unevenly distributed among reduce tasks. In this paper, we present DREAMS, a framework that provides run-time partitioning skew mitigation. Unlike previous approaches that try to balance the workload of reducers by repartitioning the intermediate data assigned to each reduce task, DREAMS copes with partitioning skew by adjusting the run-time resource allocation of each task. We show that this approach allows DREAMS to eliminate the overhead of data repartitioning. Through experiments using both real and synthetic workloads running on an 11-node virtualised Hadoop cluster, we show that DREAMS can effectively mitigate the negative impact of partitioning skew, thereby improving job performance by up to 20.3%.
Keywords :
data handling; parallel processing; pattern clustering; resource allocation; virtualisation; DREAMS; MapReduce; data repartitioning; data skew; dynamic resource allocation; large-scale data processing; map task; reduce task; run-time partitioning skew mitigation; task run-time resource allocation; virtualised Hadoop cluster; Biomedical monitoring; Containers; Mathematical model; Monitoring; Predictive models; Resource management; Yarn;
Conference_Title :
Integrated Network Management (IM), 2015 IFIP/IEEE International Symposium on
Conference_Location :
Ottawa, ON
DOI :
10.1109/INM.2015.7140272