Title :
Assessing the reliability impacts of software fault-tolerance mechanisms
Author :
Mendiratta, Veena B.
Author_Institution :
Bell Labs, Lucent Technologies, Naperville, IL, USA
Date :
30 Oct-2 Nov 1996
Abstract :
Telecommunications systems are characterized by highly stringent reliability requirements for system availability and defect rate. A combination of approaches is used to achieve high software reliability, namely fault avoidance, fault removal, and the implementation of fault-tolerant mechanisms. This paper focuses on the implementation of software fault-tolerant mechanisms and analyzes the impact of these mechanisms on software reliability. Based on field data on the frequency of invocation of some fault-tolerant mechanisms, we present an escalating recovery model for predicting the impact of these mechanisms on lost calls. The key parameters of the model are the software fault recovery coverage factor, the proportion of successful recoveries at each level, and the calls lost at each recovery level. The output of the model is the distribution and the average of the number of lost calls per software error. The applicability of this model to systems with high reliability has been validated; the applicability of the model to less reliable systems is an area for future work.
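The escalating recovery model summarized in the abstract can be illustrated with a minimal sketch. The abstract names the parameters but not the equations, so everything below is an assumption rather than the paper's formulation: recovery levels are tried in order, each level succeeds with a given conditional probability, the last level always succeeds, the calls-lost figure for a level is the total lost when recovery ends there, and an uncovered error is assumed to cost as much as the last level. The function names and all numbers are hypothetical and are not field data from the paper.

```python
def lost_call_distribution(coverage, success_probs, calls_lost):
    """Return {lost_calls: probability} for a single software error.

    coverage      -- assumed probability that the error enters the
                     escalation ladder at its lowest level
    success_probs -- assumed conditional success probability per level,
                     given that the level was reached; last must be 1.0
    calls_lost    -- assumed total calls lost if recovery ends at a level
    """
    if len(success_probs) != len(calls_lost):
        raise ValueError("one success probability and one loss per level")
    if success_probs[-1] != 1.0:
        raise ValueError("the final recovery level is assumed to succeed")

    dist = {}
    reach = coverage  # probability of reaching the current level
    for p, lost in zip(success_probs, calls_lost):
        # Recovery ends here with probability reach * p.
        dist[lost] = dist.get(lost, 0.0) + reach * p
        # Otherwise the error escalates to the next level.
        reach *= 1.0 - p

    # Assumption: an uncovered error costs as much as the final level.
    worst = calls_lost[-1]
    dist[worst] = dist.get(worst, 0.0) + (1.0 - coverage)
    return dist


def mean_lost_calls(dist):
    """Average number of lost calls per software error."""
    return sum(lost * prob for lost, prob in dist.items())


# Illustrative, made-up parameters for three recovery levels.
example = lost_call_distribution(
    coverage=0.9,
    success_probs=[0.8, 0.5, 1.0],
    calls_lost=[0, 10, 1000],
)
example_mean = mean_lost_calls(example)
```

With these made-up inputs the probability mass sits on 0, 10, and 1000 lost calls, and the mean is dominated by the rare escalations to the last level, which is the kind of effect a distribution (rather than an average alone) makes visible.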
Keywords :
software fault tolerance; system recovery; telecommunication computing; defect rate; escalating recovery model; fault avoidance; fault removal; fault-tolerant mechanisms; reliability impacts; software fault recovery coverage factor; software fault-tolerance mechanisms; software reliability; system availability; telecommunications systems; Availability; Data analysis; Data structures; Fault detection; Fault tolerance; Fault tolerant systems; Frequency; Predictive models; Software reliability; Telecommunication switching
Conference_Title :
Proceedings of the Seventh International Symposium on Software Reliability Engineering, 1996
Conference_Location :
White Plains, NY
Print_ISBN :
0-8186-7707-4
DOI :
10.1109/ISSRE.1996.558711