Title :
Reliability simulation of fault-tolerant software and systems
Author :
Gokhale, Swapna S. ; Lyu, Michael R. ; Trivedi, Kishor S.
Author_Institution :
Dept. of Electr. & Comput. Eng., Duke Univ., Durham, NC, USA
Abstract :
Fault tolerance is a survival attribute of complex computer systems and software: the ability to deliver continuous service to their users in the presence of faults. Formulating an analytic model for dependability and performance evaluation of hardware/software fault-tolerant architectures can be quite cumbersome. Also, in practice, isolating the effect of one parameter on a system while holding the others constant requires exploring a variety of scenarios, and it is economically infeasible to build several such systems. Simulation offers an attractive mechanism for dependability evaluation and for studying the influence of various parameters on the failure behavior of the system. In this paper, we develop algorithms to simulate the failure behavior of three commonly used fault-tolerant architectures, viz., Distributed Recovery Block (DRB), N-Version Programming (NVP) and N-Self Checking Programming (NSCP). We demonstrate the ability of the approach to simulate complex failure scenarios with various dependencies using some illustrative numerical examples.
Keywords :
fault tolerant computing; performance evaluation; reliability; virtual machines; Distributed Recovery Block; N-Self Checking Programming; N-Version Programming; complex failure scenarios; continuous service; dependability; failure behavior; fault tolerant architectures; fault-tolerant software; reliability simulation; survival attribute; Computational modeling; Computer architecture; Embedded software; Fault tolerance; Fault tolerant systems; Hardware; Performance analysis; Software maintenance; Software systems
Conference_Title :
Fault-Tolerant Systems, 1997. Proceedings., Pacific Rim International Symposium on
Conference_Location :
Taipei
Print_ISBN :
0-8186-8212-4
DOI :
10.1109/PRFTS.1997.640143