Title :
HAFTA: Highly Available Fault-Tolerant Architecture to Protect SRAM-Based Reconfigurable Devices Against Multiple Bit Upsets
Author :
Ghaderi, Zana ; Miremadi, Seyed Ghassem ; Asadi, Hamed ; Fazeli, Mehdi
Author_Institution :
Dept. of Comput. Eng., Sharif Univ. of Technol., Tehran, Iran
Abstract :
Despite the widespread use of SRAM-based reconfigurable devices (SRDs) in mainstream applications, their use in enterprise and safety-critical applications has been very limited because SRAM is susceptible to soft errors. Previous mitigation techniques for protecting SRDs impose significant area and power overheads. In addition, they leave configuration bits susceptible to multiple bit upsets (MBUs). In this paper, we present a highly available fault-tolerant architecture (HAFTA) that protects SRD-based designs against MBUs in both configuration and user bits. In the proposed architecture, the entire design is duplicated with respect to the relative locations of logic blocks within the SRD, and the main and replica flip-flops (FFs) are compared at each clock cycle to detect any mismatch. In addition, the unused FFs available throughout the SRD are employed as history FFs that save the latest correct state of the system. Upon detection of a mismatch between the main and replica FFs, the system rolls back to the latest correct state stored in the history FFs. Simulation results obtained from fault injection experiments show that the proposed architecture provides higher reliability and availability than traditional triple modular redundancy techniques, while incurring lower area and power overheads.
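The duplicate, compare, and roll-back cycle described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the function name run_with_rollback, the integer-encoded state, the XOR-mask fault model, and the example step function are all hypothetical. The model covers upsets in user state only; it does not model configuration-bit upsets, upsets in the history FFs, or identical upsets hitting both copies.

```python
from typing import Callable, Dict, List, Tuple


def run_with_rollback(
    step: Callable[[int], int],
    initial_state: int,
    cycles: int,
    upsets: Dict[int, Tuple[str, int]],
) -> Tuple[int, List[int]]:
    """Toy model of duplication with per-cycle comparison and rollback.

    `upsets` maps a cycle index to (copy, xor_mask). The mask is XORed once
    into the named copy ("main" or "replica") after that cycle's update.
    A mask with several bits set stands in for a multiple bit upset.
    Returns the final state and the cycles at which a rollback happened.
    """
    # Main copy, replica copy, and the "history" copy (assumed fault-free here).
    main = replica = history = initial_state
    pending = dict(upsets)  # each injected upset fires only once
    rollbacks: List[int] = []
    cycle = 0
    while cycle < cycles:
        # Both copies compute the same next state.
        main = step(main)
        replica = step(replica)
        # Hypothetical fault injection: flip bits in one copy.
        if cycle in pending:
            copy, mask = pending.pop(cycle)
            if copy == "main":
                main ^= mask
            else:
                replica ^= mask
        if main == replica:
            # Copies agree: record this state as the latest correct one.
            history = main
            cycle += 1
        else:
            # Mismatch: restore both copies from history and redo the cycle.
            main = replica = history
            rollbacks.append(cycle)
    return main, rollbacks


if __name__ == "__main__":
    step = lambda s: (s * 5 + 1) & 0xFF  # arbitrary 8-bit state update
    golden, _ = run_with_rollback(step, 1, 10, {})
    final, rolled = run_with_rollback(
        step, 1, 10, {3: ("main", 0b0110), 7: ("replica", 0b1011)}
    )
    print(final == golden, rolled)
```

In this model a mismatch costs one repeated cycle, because the history copy is refreshed every time the two copies agree. How often the real architecture updates its history FFs, and how it handles persistent configuration faults, is not stated in the abstract.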
Keywords :
SRAM chips; failure analysis; fault tolerance; flip-flops; integrated circuit design; integrated circuit reliability; FF; HAFTA; MBU; SRAM-based reconfigurable devices; SRD; SRD-based designs; area overheads; fault injection experiments; highly available fault-tolerant architecture; logic blocks; mitigation techniques; multiple bit upsets; power overheads; reliability; replica flip-flops; safety-critical applications; soft errors; triple modular redundancy techniques; Availability; Circuit faults; Clocks; Computer architecture; Redundancy; Tunneling magnetoresistance; SRAM-based reconfigurable devices (SRDs); multiple bit upsets (MBUs)
Journal_Title :
Device and Materials Reliability, IEEE Transactions on
DOI :
10.1109/TDMR.2012.2229710