DocumentCode :
25551
Title :
Improving MapReduce Performance Using Smart Speculative Execution Strategy
Author :
Qi Chen ; Cheng Liu ; Zhen Xiao
Author_Institution :
Dept. of Comput. Sci., Peking Univ., Beijing, China
Volume :
63
Issue :
4
fYear :
2014
fDate :
April 2014
Firstpage :
954
Lastpage :
967
Abstract :
MapReduce is a widely used parallel computing framework for large-scale data processing. The two major performance metrics in MapReduce are job execution time and cluster throughput. Both can be seriously impacted by straggler machines: machines on which tasks take an unusually long time to finish. Speculative execution is a common approach to the straggler problem that simply backs up slow-running tasks on alternative machines. Multiple speculative execution strategies have been proposed, but they share several pitfalls: (i) they use the average progress rate to identify slow tasks, while in reality the progress rate can be unstable and misleading; (ii) they cannot appropriately handle data skew among tasks; (iii) they do not consider whether backup tasks can finish earlier when choosing backup worker nodes. In this paper, we first present a detailed analysis of scenarios where existing strategies do not work well. We then develop a new strategy, maximum cost performance (MCP), which significantly improves the effectiveness of speculative execution. To identify stragglers accurately and promptly, MCP: (i) uses both the progress rate and the process bandwidth within a phase to select slow tasks; (ii) uses an exponentially weighted moving average (EWMA) to predict process speed and calculate a task's remaining time; (iii) decides which task to back up based on the cluster load, using a cost-benefit model. To choose proper worker nodes for backup tasks, we take both data locality and data skew into consideration. We evaluate MCP in a cluster of 101 virtual machines running a variety of applications on 30 physical servers. Experimental results show that MCP can run jobs up to 39 percent faster and improve cluster throughput by up to 44 percent compared to Hadoop-0.21.
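The abstract's point (ii), predicting a task's remaining time from an EWMA of its process speed, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the smoothing factor `ALPHA`, the function names, and the sample values are all assumptions for demonstration.

```python
# Sketch of an EWMA-based remaining-time estimate, in the spirit of MCP:
# smooth the observed per-interval process speeds, then divide the amount
# of unprocessed data by the smoothed speed.
# ALPHA is an assumed smoothing factor, not a value from the paper.

ALPHA = 0.3


def ewma_speed(samples, alpha=ALPHA):
    """Exponentially weighted moving average of observed process speeds."""
    speed = samples[0]
    for s in samples[1:]:
        # Recent observations get weight alpha; history decays geometrically.
        speed = alpha * s + (1 - alpha) * speed
    return speed


def remaining_time(data_left, speed_samples):
    """Predict a task's remaining time from its smoothed process speed."""
    return data_left / ewma_speed(speed_samples)
```

A scheduler could then compare `remaining_time` of a running task against the expected time of a fresh backup copy to decide, via a cost-benefit test, whether speculation pays off.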
Keywords :
cost-benefit analysis; moving average processes; parallel programming; pattern clustering; software performance evaluation; virtual machines; EWMA; MCP; MapReduce performance improvement; average progress rate; backup worker nodes; cluster load; cluster throughput; cost-benefit model; data locality; data skew; exponentially weighted moving average; job execution time; large-scale data processing; maximum cost performance; parallel computing framework; performance metrics; physical servers; process bandwidth; process speed prediction; progress rate; slow task identification; smart speculative execution strategy; straggler machines; task remaining time calculation; virtual machines; Algorithm design and analysis; Indexes; Optimization; Real-time systems; Redundancy; Silicon; Time factors; MapReduce; cluster throughput; cost performance; speculative execution; straggler;
fLanguage :
English
Journal_Title :
IEEE Transactions on Computers
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.2013.15
Filename :
6419699