Author_Institution :
Embedded Syst. Lab. (ESLAB), Linköping Univ., Linköping, Sweden
Abstract :
This paper deals with the design of embedded systems for safety-critical applications, where both fault-tolerance and real-time requirements should be taken into account at the same time. With silicon technology scaling, integrated circuits are implemented with smaller transistors, operate at higher clock frequencies, and run at lower voltage levels. As a result, they are subject to more faults, in particular transient faults. Additionally, in nano-scale technology, physics-based random variations play an important role in many device performance metrics and have led to many new defects. We therefore face the challenge of building reliable and predictable embedded systems for safety-critical applications out of unreliable components. This paper describes several key challenges and presents several emerging solutions for the design and optimization of such systems. In particular, it discusses two topics: the advantages of handling transient faults with time-redundancy based fault-tolerance techniques that are triggered by fault occurrences, and the hardware/software trade-offs related to fault detection and fault tolerance.
Keywords :
embedded systems; fault tolerant computing; optimisation; power aware computing; clock frequency; device performance metrics; fault detection; fault-tolerance; integrated circuits; nanoscale technology; optimization; physics-based random variations; real-time requirements; reliable embedded systems; safety-critical applications; time-redundancy techniques; transient faults; transistors; Circuit faults; Embedded system; Fault tolerance; Fault tolerant systems; Hardware; Transient analysis;