DocumentCode :
3444206
Title :
Reliability enhancement of real-time multiprocessor systems through dynamic reconfiguration
Author :
Yu, Kai ; Koren, Israel
Author_Institution :
Dept. of Electr. & Comput. Eng., Massachusetts Univ., Amherst, MA, USA
fYear :
1994
fDate :
12-14 Jun 1994
Firstpage :
161
Lastpage :
168
Abstract :
Enhancing the reliability of a system executing real-time jobs is, in many cases, one of the most important design goals. A dynamically reconfigurable system offers one approach to improving reliability. To achieve high reliability, the most suitable recovery action must be used when a fault occurs, which means that some kind of optimal recovery strategy should be followed. This is called a dynamic recovery strategy. To satisfy the service requirements of real-time jobs with hard deadlines, a more powerful system, intuitively, should always be preferred. On the other hand, higher processing capacity means more processing modules and electronic parts, which may result in more frequent faults and a higher risk that the system will fail to complete the real-time jobs before their deadlines. In this paper, we investigate the reliability enhancement of a real-time distributed computing system with hard deadlines through the employment of dynamic recovery strategies. Since the classical reliability evaluation technique is not applicable to a dynamically reconfigurable system, we present a new approach to reliability evaluation. The results show that the optimal recovery policy can significantly improve the system's reliability; that both the job arrival rate and the job's deadline have a significant effect on the optimal reliability and the optimal policy; and that, for a given workload and deadline, the maximum system reliability is achieved at a certain (optimal) configuration.
Keywords :
fault tolerant computing; multiprocessing systems; performance evaluation; real-time systems; reconfigurable architectures; reliability; system recovery; distributed computing system; dynamic reconfiguration; dynamic recovery; hard deadlines; optimal recovery strategy; real-time multiprocessor systems; recovery action; reliability; reliability evaluation; Contracts; Distributed computing; Employment; Job design; Military computing; Multiprocessing systems; Physics computing; Power system management; Power system reliability; Real time systems;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fault-Tolerant Parallel and Distributed Systems, 1994., Proceedings of IEEE Workshop on
Conference_Location :
College Station, TX
Print_ISBN :
0-8186-6807-5
Type :
conf
DOI :
10.1109/FTPDS.1994.494487
Filename :
494487