DocumentCode :
3051333
Title :
Fault Tolerant Clustering in Scientific Workflows
Author :
Chen, Weiwei ; Deelman, Ewa
Author_Institution :
Inf. Sci. Inst., Univ. of Southern California, Marina del Rey, CA, USA
fYear :
2012
fDate :
24-29 June 2012
Firstpage :
9
Lastpage :
16
Abstract :
Task clustering has been proven to be an effective method to reduce execution overhead and increase the computational granularity of workflow tasks executing on distributed resources. However, a job composed of multiple tasks may have a greater risk of suffering from failures than a job composed of a single task. Our theoretical analysis and simulation results demonstrate that failures can have a significant impact on the runtime performance of workflows when existing clustering policies ignore failures. We therefore propose two general failure modeling frameworks (a task failure model and a job failure model) to address these performance issues. We show that fault tolerance must be considered in the task failure model. Based on the task failure model, we propose three methods to improve workflow performance in dynamic environments. A simulation-based evaluation shows that our approach can significantly improve the workflow makespan for two important applications.
Keywords :
fault tolerant computing; pattern clustering; task analysis; computational granularity; execution overhead; failure modeling frameworks; fault tolerant clustering; job failure model; runtime performance; scientific workflows; simulation-based evaluation; task clustering; task failure model; Computational modeling; Engines; Fault tolerance; Fault tolerant systems; Runtime; Strontium; Transient analysis; clustering; fault tolerance; task failure; workflow;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Services (SERVICES), 2012 IEEE Eighth World Congress on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4673-3053-4
Type :
conf
DOI :
10.1109/SERVICES.2012.5
Filename :
6274026