  • DocumentCode
    812475
  • Title
    Improving Error Tolerance for Multithreaded Register Files
  • Author
    Wang, Lei; Patel, Niral
  • Author_Institution
    Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT
  • Volume
    16
  • Issue
    8
  • fYear
    2008
  • Firstpage
    1009
  • Lastpage
    1020
  • Abstract
    Chip multithreaded computing faces the dual challenges of increasing system complexity and error sensitivity. It is critical to develop effective solutions that achieve better error tolerance without degrading performance. In this paper, we propose a new error-tolerant memory design based on a unique computing phenomenon referred to as dynamic multithreading redundancy (DMR). The proposed technique exploits the interplay between concurrent threads for runtime error control. We also present two DMR enhancements, immediate write-back and self-recovery, to address the error accumulation effect. A multithreaded register file was implemented to demonstrate the proposed DMR technique. Simulation results on the SPEC CPU2000 benchmarks show a significant reduction in the performance and energy overhead of error recovery. In addition, the proposed technique scales well with instruction-level and thread-level parallelism, which suits next-generation processor designs, where the soft error problem is expected to worsen due to technology scaling and architecture-affecting trends.
  • Keywords
    fault tolerant computing; integrated circuit design; integrated memory circuits; multi-threading; system recovery; SPEC CPU2000 benchmarks; chip multithreaded computing; concurrent threads; dynamic multithreading redundancy; error recovery; error sensitivity; error-tolerant memory design; instruction-level parallelism; multithreaded register files; next-generation processor design; overhead reduction; runtime error control; system complexity; thread-level parallelism; Degradation; Energy efficiency; Error correction; Multithreading; Parallel processing; Redundancy; Registers; Runtime; Scalability; Error tolerance; VLSI design; memory circuits; register files; soft errors
  • fLanguage
    English
  • Journal_Title
    Very Large Scale Integration (VLSI) Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-8210
  • Type
    jour
  • DOI
    10.1109/TVLSI.2008.2000521
  • Filename
    4570470