DocumentCode :
75641
Title :
Budget-Driven Scheduling Algorithms for Batches of MapReduce Jobs in Heterogeneous Clouds
Author :
Yang Wang ; Wei Shi
Author_Institution :
Fac. of Comput. Sci., Univ. of New Brunswick, Fredericton, NB, Canada
Volume :
2
Issue :
3
fYear :
2014
fDate :
July-Sept. 1 2014
Firstpage :
306
Lastpage :
319
Abstract :
In this paper, we consider task-level scheduling algorithms with respect to budget and deadline constraints for a batch of MapReduce jobs on a set of provisioned heterogeneous (virtual) machines in cloud platforms. The heterogeneity is manifested in the popular “pay-as-you-go” charging model where the service machines with different performance would have different service rates. We organize the batch of jobs as a k-stage workflow and study two related optimization problems, depending on whether the constraints are on monetary budget or on scheduling length of the workflow. First, given a total monetary budget B, by combining an in-stage local greedy algorithm (whose optimality is also proven) and dynamic programming (DP) techniques, we propose a global optimal scheduling algorithm to achieve minimum scheduling length of the workflow within O(kB2). Although the optimal algorithm is efficient when B is polynomially bounded by the number of tasks in the MapReduce jobs, the quadratic time complexity is still high. To improve the efficiency, we further develop two greedy algorithms, called Global Greedy Budget (GGB) and Gradual Refinement (GR), each adopting different greedy strategies. In GGB we extend the idea of the local greedy algorithm to the efficient global distribution of the budget with minimum scheduling length as a goal whilst in GR we iteratively apply the DP algorithm to the distribution of exponentially reduced budget so that the solutions are gradually refined. Second, we consider the optimization problem of minimizing cost when the (time) deadline of the computation D is fixed. We convert this problem into the standard Multiple-Choice Knapsack Problem via a parallel transformation. Our empirical studies verify the proposed optimal algorithms and show the efficiencies of the greedy algorithms in cost-effectiveness to distribute the budget for performance optimizations of the MapReduce workflows.
Keywords :
cloud computing; computational complexity; dynamic programming; greedy algorithms; iterative methods; knapsack problems; parallel programming; scheduling; virtual machines; MapReduce jobs; budget constraint; budget-driven scheduling algorithms; cost minimization; deadline constraint; dynamic programming techniquee; global greedy budget; global optimal scheduling algorithm; gradual refinement; heterogeneous cloud computing; heterogeneous machines; in-stage local greedy algorithm; k-stage workflow; multiple-choice knapsack problem; optimization problems; pay-as-you-go charging model; quadratic time complexity; scheduling length; service machines; service rates; task-level scheduling algorithms; total monetary budget; virtual machines; Cloud computing; Greedy algorithms; Heuristic algorithms; Optimal scheduling; Scheduling; Scheduling algorithms; MapReduce scheduling; cloud computing; cost and time constraints; dynamic programming; optimal greedy algorithm; optimal parallel scheduling algorithm;
fLanguage :
English
Journal_Title :
Cloud Computing, IEEE Transactions on
Publisher :
ieee
ISSN :
2168-7161
Type :
jour
DOI :
10.1109/TCC.2014.2316812
Filename :
6787054
Link To Document :
بازگشت