DocumentCode :
2805897
Title :
Fault-tolerant scheduling with dynamic number of replicas in heterogeneous systems
Author :
Zhao, Laiping ; Ren, Yizhi ; Xiang, Yang ; Sakurai, Kouichi
Author_Institution :
Dept. of Inf., Kyushu Univ., Fukuoka, Japan
fYear :
2010
fDate :
1-3 Sept. 2010
Firstpage :
434
Lastpage :
441
Abstract :
Existing studies on fault-tolerant scheduling use the active replication schema with ε + 1 replicas of each task to tolerate ε failures. In this paper, however, we show that more replicas do not always lead to higher reliability. Moreover, more replicas imply more resource consumption and a higher economic cost. To address this problem, and with the goal of satisfying the user's reliability requirement with minimum resources, this paper proposes a new fault-tolerant scheduling algorithm: MaxRe. The algorithm incorporates reliability analysis into the active replication schema and uses a dynamic number of replicas for different tasks. Both theoretical analysis and experiments show that the schedule produced by MaxRe satisfies the user's reliability requirements, and that MaxRe achieves the corresponding reliability with up to 70% fewer resources than the FTSA algorithm.
Keywords :
cloud computing; fault tolerant computing; formal specification; formal verification; processor scheduling; software reliability; FTSA algorithm; MaxRe; active replication schema; fault tolerant scheduling; heterogeneous system; user reliability requirement; Algorithm design and analysis; Fault tolerance; Program processors; Schedules; Scheduling algorithm; Fault-tolerance; Heterogeneous system; Reliability; Resource scheduling;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing and Communications (HPCC), 2010 12th IEEE International Conference on
Conference_Location :
Melbourne, VIC
Print_ISBN :
978-1-4244-8335-8
Type :
conf
DOI :
10.1109/HPCC.2010.72
Filename :
5738914