Title :
Comparing the effects of intermittent and transient hardware faults on programs
Author :
Wei, Jiesheng ; Rashid, Layali ; Pattabiraman, Karthik ; Gopalakrishnan, Sathish
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of British Columbia, Vancouver, BC, Canada
Abstract :
The trends of shrinking device geometries, lower voltages and higher frequencies in modern processors are expected to increase the rate of intermittent faults. This requires the design of software that is resilient to intermittent faults. There has been substantial research on software systems that are resilient to transient faults. However, it is unclear whether the impact of intermittent faults on programs is similar to that of transient faults. This question is important for deciding whether novel techniques are needed for tolerating intermittent faults in software. In this study, we attempt to answer this question by comparing the effects of intermittent and transient hardware faults on programs through fault-injection experiments performed in a micro-architectural simulator for a simple five-stage pipelined processor. We also investigate whether the differences (if any) vary with the length (i.e., duration in cycles) of the fault and with the micro-architectural unit in which the fault originates. The results show that the impact of intermittent faults on programs is significantly different from that of transient faults, and that the difference depends on both the length of the fault and the fault's origin. Therefore, existing software techniques for ensuring resilience to transient faults may not be sufficient for intermittent faults, and new techniques are needed.
Keywords :
pipeline processing; program diagnostics; software fault tolerance; fault injection experiment; five-stage pipelined processor; intermittent hardware fault; microarchitectural simulator; software design; transient hardware fault; Benchmark testing; Circuit faults; Clocks; Computer crashes; Program processors; Transient analysis; Intermittent fault; Micro-architectural-level fault injection; Transient fault
Conference_Title :
Dependable Systems and Networks Workshops (DSN-W), 2011 IEEE/IFIP 41st International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4577-0374-4
Electronic_ISBN :
978-1-4577-0373-7
DOI :
10.1109/DSNW.2011.5958835