Title :
Rethinking error injection for effective resilience
Author :
Mirkhani, Shahrzad; Cho, Hyungmin; Mitra, Subhasish; Abraham, J.A.
Author_Institution :
ECE Dept., Univ. of Texas at Austin, Austin, TX, USA
Abstract :
Soft errors, caused by radiation, have become a major challenge in today's computer systems and networking equipment, making it imperative that systems be designed to be resilient to errors. Error injection is a powerful approach to evaluate system resilience, and current practice is to inject errors in architectural registers of processors, program variables of applications, or storage elements in the hardware model. This paper, using answers to frequently asked questions, discusses the need for rethinking conventional approaches to error injection, showing data from recent research and our simulation results. Approaches to improving current error injections are also suggested.
Keywords :
software fault tolerance; architectural registers; computer systems; error injection; networking equipment; program variables; soft errors; storage elements; system resilience evaluation; Analytical models; Computational modeling; Computers; Hardware; Program processors; Registers; Resilience
Conference_Title :
Design Automation Conference (ASP-DAC), 2014 19th Asia and South Pacific
Conference_Location :
Singapore
DOI :
10.1109/ASPDAC.2014.6742922