Title :
Off-line real-time fault-tolerant scheduling
Author :
C. Dima;A. Girault;C. Lavarenne;Y. Sorel
Author_Institution :
INRIA, Montbonnot St. Martin, France
fDate :
2001
Abstract :
We address the problem of off-line fault-tolerant scheduling of an algorithm onto a distributed-memory multiprocessor architecture and provide a generic algorithm that solves it. We take into account two kinds of failures: fail-silent and omission. The basic technique we use is the replication of operations and data communications. We then discuss the principles that govern the execution of schedules with replication under the state-machine and primary/backup arbitrations between replicas. We also show how to compute the execution date of each operation and the timeouts used for detecting failures. We end with a heuristic that, using this calculus, computes a possibly non-optimal schedule by finding plain schedules for each failure pattern and then combining them into a schedule with replication.
Keywords :
"Fault tolerance","Processor scheduling","Scheduling algorithm","Fault tolerant systems","Protocols","Topology","Heuristic algorithms","Time factors","NP-complete problem","Hardware"
Conference_Titel :
Parallel and Distributed Processing, 2001. Proceedings. Ninth Euromicro Workshop on
Print_ISBN :
0-7695-0987-8
DOI :
10.1109/EMPDP.2001.905069