DocumentCode
74137
Title
HAFTA: Highly Available Fault-Tolerant Architecture to Protect SRAM-Based Reconfigurable Devices Against Multiple Bit Upsets
Author
Ghaderi, Zana ; Miremadi, Seyed Ghassem ; Asadi, Hamed ; Fazeli, Mehdi
Author_Institution
Dept. of Comput. Eng., Sharif Univ. of Technol., Tehran, Iran
Volume
13
Issue
1
fYear
2013
fDate
Mar-13
Firstpage
203
Lastpage
212
Abstract
Despite the widespread use of SRAM-based reconfigurable devices (SRDs) in mainstream applications, their use has been very limited in enterprise and safety-critical applications due to SRAM susceptibility to soft errors. Previous mitigation techniques to protect SRDs impose significant area and power overheads. Additionally, they leave configuration bits susceptible to multiple bit upsets (MBUs). In this paper, we present a highly available fault-tolerant architecture to protect SRD-based designs against MBUs in both configuration and user bits. In the proposed architecture, the entire design is duplicated with respect to the relative locations of logic blocks within the SRD, and the main and replica flip-flops (FFs) are compared at each clock cycle to detect any possible mismatch. In addition, the unused FFs available throughout SRDs are employed as history FFs to save the latest correct state of the system. Upon detection of any mismatch between the main and replica FFs, the system is able to roll back to the latest correct state stored in the history FFs. Simulation results obtained from fault injection experiments demonstrate that the proposed architecture provides both higher reliability and higher availability than traditional triple modular redundancy techniques, while incurring lower area and power overheads.
Keywords
SRAM chips; failure analysis; fault tolerance; flip-flops; integrated circuit design; integrated circuit reliability; FF; HAFTA; MBU; SRAM-based reconfigurable devices; SRD; SRD-based designs; area overheads; fault injection experiments; highly available fault-tolerant architecture; logic blocks; mitigation techniques; multiple bit upsets; power overheads; reliability; replica flip-flops; safety-critical applications; soft errors; triple modular redundancy techniques; Availability; Circuit faults; Clocks; Computer architecture; Redundancy; Tunneling magnetoresistance; SRAM-based reconfigurable devices (SRDs); multiple bit upsets (MBUs)
fLanguage
English
Journal_Title
Device and Materials Reliability, IEEE Transactions on
Publisher
IEEE
ISSN
1530-4388
Type
jour
DOI
10.1109/TDMR.2012.2229710
Filename
6359894