• DocumentCode
    74137
  • Title
    HAFTA: Highly Available Fault-Tolerant Architecture to Protect SRAM-Based Reconfigurable Devices Against Multiple Bit Upsets
  • Author
    Ghaderi, Zana ; Miremadi, Seyed Ghassem ; Asadi, Hamed ; Fazeli, Mehdi
  • Author_Institution
    Dept. of Comput. Eng., Sharif Univ. of Technol., Tehran, Iran
  • Volume
    13
  • Issue
    1
  • fYear
    2013
  • fDate
    March 2013
  • Firstpage
    203
  • Lastpage
    212
  • Abstract
    Despite the widespread use of SRAM-based reconfigurable devices (SRDs) in mainstream applications, their use in enterprise and safety-critical applications has been very limited because SRAM is susceptible to soft errors. Previous mitigation techniques to protect SRDs impose significant area and power overheads. They also leave configuration bits susceptible to multiple bit upsets (MBUs). In this paper, we present a highly available fault-tolerant architecture to protect SRD-based designs against MBUs in both configuration and user bits. In the proposed architecture, the entire design is duplicated with respect to the relative locations of logic blocks within the SRD, and the main and replica flip-flops (FFs) are compared at each clock cycle to detect any possible mismatch. In addition, the unused FFs available throughout SRDs are employed as history FFs to save the latest correct state of the system. Upon detection of any mismatch between the main and replica FFs, the system is able to roll back to the latest correct state stored in the history FFs. Simulation results obtained from fault-injection experiments demonstrate that the proposed architecture provides both higher reliability and higher availability than traditional triple modular redundancy techniques, while incurring lower area and power overheads.
  • Keywords
    SRAM chips; failure analysis; fault tolerance; flip-flops; integrated circuit design; integrated circuit reliability; FF; HAFTA; MBU; SRAM-based reconfigurable devices; SRD; SRD-based designs; area overheads; fault injection experiments; highly available fault-tolerant architecture; logic blocks; mitigation techniques; multiple bit upsets; power overheads; reliability; replica flip-flops; safety-critical applications; soft errors; triple modular redundancy techniques; Availability; Circuit faults; Clocks; Computer architecture; Redundancy; Tunneling magnetoresistance; Availability; SRAM-based reconfigurable devices (SRDs); multiple bit upsets (MBUs); reliability
  • fLanguage
    English
  • Journal_Title
    Device and Materials Reliability, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1530-4388
  • Type
    jour
  • DOI
    10.1109/TDMR.2012.2229710
  • Filename
    6359894