Title :
HW/SW co-detection of transient and permanent faults with fast recovery in statically scheduled data paths
Author_Institution :
Dept. of Comput. Sci., Brandenburg Univ. of Technol., Cottbus, Germany
Abstract :
This paper describes a hardware/software-based technique for making the data path of a statically scheduled superscalar processor fault tolerant. The results of concurrently executed operations can be compared with little hardware overhead to detect a transient or permanent fault. Furthermore, the hardware extension makes it possible to recover from a fault within one to two clock cycles and to distinguish between transient and permanent faults. Once a permanent fault has been detected, it is masked for the rest of the program execution, so no further time is needed to recover from it. The proposed extensions were implemented in the data path of a simple VLIW processor to demonstrate feasibility and to determine the hardware overhead. Finally, a reliability analysis is presented. It shows that for medium- and large-scale data paths the extension provides up to 98% better reliability than triple modular redundancy.
Keywords :
fault diagnosis; fault tolerance; hardware-software codesign; reliability; scheduling; statistical analysis; HW-SW codetection; VLIW processor; hardware overhead; hardware-software-based technique; permanent faults; reliability analysis; statically scheduled data paths; super scalar processor fault tolerance; triple modular redundancy; Application specific processors; Circuit faults; Clocks; Embedded system; Fault detection; Hardware; Power system reliability; Processor scheduling; Production; VLIW;
Conference_Title :
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2010
Conference_Location :
Dresden
Print_ISBN :
978-1-4244-7054-9
DOI :
10.1109/DATE.2010.5456957