Title :
An efficient fault-tolerant scheduling algorithm for real-time tasks with precedence constraints in heterogeneous systems
Author :
Qin, Xiao ; Jiang, Hong ; Swanson, David R.
Author_Institution :
Dept. of Comput. Sci. & Eng., Nebraska Univ., Lincoln, NE, USA
Abstract :
In this paper, we investigate an efficient off-line scheduling algorithm in which real-time tasks with precedence constraints are executed in a heterogeneous environment. It provides more features and capabilities than existing algorithms that schedule only independent tasks in real-time homogeneous systems. In addition, the proposed algorithm takes the heterogeneities of computation, communication and reliability into account, thereby improving the reliability. To provide fault-tolerant capability, the algorithm employs a primary-backup copy scheme that enables the system to tolerate permanent failures in any single processor. In this scheme, a backup copy is allowed to overlap with other backup copies on the same processor, as long as their corresponding primary copies are allocated to different processors. Tasks are judiciously allocated to processors so as to reduce the schedule length as well as the reliability cost, defined to be the product of processor failure rate and task execution time. In addition, the time for detecting and handling a permanent fault is incorporated into the scheduling scheme, thus making the algorithm more practical. To quantify the combined performance of fault-tolerance and schedulability, the performability measure is introduced Compared with the existing scheduling algorithms in the literature, our scheduling algorithm achieves an average of 16.4% improvement in reliability and an average of 49.3% improvement in performability.
Keywords :
back-up procedures; fault tolerant computing; parallel algorithms; real-time systems; scheduling; communication; computation; efficient fault-tolerant off-line scheduling algorithm; heterogeneous environment; performability measure; permanent failure tolerance; precedence constraints; primary backup copy scheme; processor failure rate; real-time tasks; reliability; reliability cost; schedulability; schedule length; task allocation; task execution time; Computer science; Costs; Distributed computing; Fault detection; Fault tolerance; Fault tolerant systems; Performance evaluation; Processor scheduling; Real time systems; Scheduling algorithm;
Conference_Titel :
Parallel Processing, 2002. Proceedings. International Conference on
Print_ISBN :
0-7695-1677-7
DOI :
10.1109/ICPP.2002.1040892