DocumentCode :
3783383
Title :
Off-line real-time fault-tolerant scheduling
Author :
C. Dima;A. Girault;C. Lavarenne;Y. Sorel
Author_Institution :
INRIA, Montbonnot St. Martin, France
fYear :
2001
fDate :
2001
Firstpage :
410
Lastpage :
417
Abstract :
We address the problem of off-line fault-tolerant scheduling of an algorithm onto a multiprocessor architecture with distributed memory and provide a generic algorithm which solves this problem. We take into account two kinds of failures: fail-silent and omission. The basic technique we use is the replication of operations and data communications. We then discuss the principles which govern the execution of schedulings with replication under the state-machine and the primary/backup arbitrations between replicas. We also show how to compute the execution date for each operation and the timeouts which are used for detecting failures. We end with a heuristic which, using this calculus, computes a possibly non-optimal scheduling by finding plain schedulings for each failure pattern and then combining them into a scheduling with replication.
Keywords :
"Fault tolerance","Processor scheduling","Scheduling algorithm","Fault tolerant systems","Protocols","Topology","Heuristic algorithms","Time factors","NP-complete problem","Hardware"
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing, 2001. Proceedings. Ninth Euromicro Workshop on
Print_ISBN :
0-7695-0987-8
Type :
conf
DOI :
10.1109/EMPDP.2001.905069
Filename :
905069