DocumentCode :
3664205
Title :
Scheduling Computational Workflows on Failure-Prone Platforms
Author :
Guillaume Aupy;Anne Benoit;Henri Casanova;Yves Robert
Author_Institution :
Ecole Normale Super. de Lyon, Lyon, France
fYear :
2015
fDate :
5/1/2015 12:00:00 AM
Firstpage :
473
Lastpage :
482
Abstract :
We study the scheduling of computational workflows on compute resources that experience exponentially distributed failures. When a failure occurs, rollback and recovery is used to resume the execution from the last check pointed state. The scheduling problem is to minimize the expected execution time by deciding in which order to execute the tasks in the workflow and whether to checkpoint or not checkpoint a task after it completes. We give a polynomial-time algorithm for fork graphs and show that the problem is NP-complete with join graphs. Our main result is a polynomial-time algorithm to compute the execution time of a workflow with specified to-be-check pointed tasks. Using this algorithm as a basis, we propose efficient heuristics for solving the scheduling problem. We evaluate these heuristics for representative workflow configurations.
Keywords :
"Program processors","Schedules","Checkpointing","Processor scheduling","Polynomials","Scheduling","Heuristic algorithms"
Publisher :
ieee
Conference_Titel :
Parallel and Distributed Processing Symposium Workshop (IPDPSW), 2015 IEEE International
Type :
conf
DOI :
10.1109/IPDPSW.2015.33
Filename :
7284346
Link To Document :
بازگشت