DocumentCode :
1081479
Title :
Optimal configuration of redundant real-time systems in the face of correlated failure
Author :
Krishna, C.M. ; Singh, A.D.
Author_Institution :
Dept. of Electr. & Comput. Eng., Massachusetts Univ., Amherst, MA, USA
Volume :
44
Issue :
4
fYear :
1995
fDate :
12/1/1995
Firstpage :
587
Lastpage :
594
Abstract :
Real-time computers are frequently used in harsh environments, such as space or industry. Lightning strikes, streams of elementary particles, and other manifestations of a harsh operating environment can cause transient failures in processors. Since the entire system is in the same environment, an especially severe disturbance can result in a momentary, correlated failure of all the processors. To have the system survive transient correlated failures and still execute all its critical workload on time, designers must use time redundancy. To survive permanent or transient independently-occurring failures, processor redundancy must be used, and the computer configured into redundant clusters. Given a fixed total number of processors, there is a tradeoff between processor redundancy and time redundancy. This paper considers the tradeoffs between configuring the system into duplexes and triplexes. There are pessimistic and optimistic reliability models for each configuration. For the range of pertinent parameters, the two models give very close results, indicating that they are quite accurate. The duplex-triplex tradeoff is between the effects of permanent, independent-transient, and correlated-transient failures. Configuring the system in triplexes provides better protection against permanent and independent-transient failures, but diminishes protection against correlated-transient failures. The better configuration is given for each application.
Keywords :
fault tolerant computing; real-time systems; redundancy; reliability; reliability theory; system recovery; correlated failure; elementary particle streams; harsh environments; lightning strikes; optimal configuration; optimistic reliability models; permanent independently-occurring failures; pessimistic reliability models; processor redundancy; real-time computers; redundant clusters; redundant real-time systems; time redundancy; transient correlated failures; transient independently-occurring failures; Aerospace electronics; Application software; Automatic control; Electromagnetic radiation; Electromagnetic transients; Fault tolerant systems; Lightning; Real time systems; Redundancy; Space vehicles;
fLanguage :
English
Journal_Title :
Reliability, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9529
Type :
jour
DOI :
10.1109/24.475977
Filename :
475977