Author_Institution :
Dipt. di Inf. e Sist., Univ. degli Studi di Napoli Federico II, Naples, Italy
Abstract :
Fault Tolerance Mechanisms (FTMs) are extensively used in software systems to counteract software faults, in particular faults that manifest transiently, namely Mandelbugs. In this scenario, Software Fault Injection (SFI) plays a key role in the verification and improvement of FTMs. However, no previous work has investigated whether SFI techniques are able to emulate Mandelbugs adequately. This is an important concern for assessing critical systems, since Mandelbugs are a major cause of failures, and FTMs are specifically tailored for this class of software faults. In this paper, we analyze an existing state-of-the-art SFI technique, namely G-SWFIT, in the context of a real-world fault-tolerant system for Air Traffic Control (ATC). The analysis highlights limitations of G-SWFIT in emulating the transient nature of Mandelbugs: most of the injected faults are activated in the early phase of execution, and they deterministically affect process replicas in the system. We also observe that G-SWFIT leaves 35% of the states of the considered system untested. Moreover, by means of an experiment, we show how emulating Mandelbugs can improve SFI. In particular, we emulate concurrency faults, which are a critical sub-class of Mandelbugs, in a fully representative way. We show that proper fault triggering can increase confidence in the testing of FTMs, since it reduces the share of untested states to 5%.
Keywords :
air traffic control; software fault tolerance; G-SWFIT; Mandelbugs; air traffic control; concurrency faults; dependability assessment; fault tolerance mechanisms; software fault injection; software systems; transient software faults emulation; Air traffic control; Computer bugs; Concurrent computing; Emulation; Fault tolerant systems; Redundancy; Reproducibility of results; Software systems; Testing; Transient analysis; Dependability Assessment; Fault Tolerance; Mandelbugs; Software Fault Injection; Software Faults;