• DocumentCode
    2261698
  • Title

    A strategy for soft error reduction in multi core designs

  • Author

    Hyman, Ransford, Jr. ; Bhattacharya, Koustav ; Ranganathan, Nagarajan

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL, USA
  • fYear
    2009
  • fDate
    24-27 May 2009
  • Firstpage
    2217
  • Lastpage
    2220
  • Abstract
    With the continuous decrease in minimum feature size and increase in chip density, modern processors are becoming increasingly susceptible to soft errors. In the past, lockstep execution with redundant threads on duplicated pipelines has been used for soft error rate reduction; it can achieve high error coverage, but at the cost of large overheads in area and performance. In this paper, we propose techniques for protection against soft errors in multi-core designs using (i) the properties of spatial and temporal redundancy and (ii) value based detection. We utilize temporal redundancy by using the "latency use slack" (LSC) of an instruction, which we define as the number of cycles before the computed result of the instruction becomes the source operand of a subsequent instruction, while spatial redundancy is exploited by duplicating the instruction on a nearby idle processor core. Further, value based detection is explored by exploiting the width of operands with small data values and by generating residue code check bits for the source operands. When a soft error is detected, error correction is achieved by rolling back execution to a previous checkpoint state and re-executing the instructions. The proposed techniques have been implemented on the RSIM simulation framework and validated using the SPLASH benchmarks. Our results indicate that the soft error detection schemes proposed in this work can be implemented, on average, with less than a 10% increase in CPI on modern multi-core designs.
  • Keywords
    error correction; microprocessor chips; radiation hardening (electronics); redundancy; RSIM simulation; SPLASH benchmarks; duplicated pipelines; error correction; latency use slack; multicore designs; processor core; soft error rate reduction; spatial redundancy; temporal redundancy; value based detection; Computer errors; Costs; Error analysis; Error correction; Logic; Multicore processing; Pipelines; Protection; Redundancy; Yarn;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Circuits and Systems, 2009. ISCAS 2009. IEEE International Symposium on
  • Conference_Location
    Taipei
  • Print_ISBN
    978-1-4244-3827-3
  • Electronic_ISBN
    978-1-4244-3828-0
  • Type

    conf

  • DOI
    10.1109/ISCAS.2009.5118238
  • Filename
    5118238
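
The abstract mentions residue code check bits generated for source operands. As a rough illustration of that general idea only, and not the paper's actual scheme, the sketch below checks an addition with a mod-3 residue code. The modulus, the function names, and the `inject_fault_bit` parameter are all assumptions made for this example; the record does not state which residue code or operations the authors use. Python integers are unbounded, so fixed-width overflow is not modeled.

```python
# Illustrative mod-3 residue check for addition (assumed scheme, not from the paper).
MODULUS = 3  # hypothetical choice; low-cost residue codes often use 2**k - 1


def residue(value, modulus=MODULUS):
    """Return the residue check bits for a non-negative integer operand."""
    return value % modulus


def checked_add(a, b, inject_fault_bit=None, modulus=MODULUS):
    """Add a and b, then verify the result against the operands' residues.

    inject_fault_bit (hypothetical test hook) flips one bit of the result
    to imitate a soft error in the adder output.
    Returns (result, ok), where ok is False when a mismatch is detected.
    """
    # Residue predicted from the source operands alone.
    predicted = (residue(a, modulus) + residue(b, modulus)) % modulus

    result = a + b
    if inject_fault_bit is not None:
        result ^= 1 << inject_fault_bit  # simulate a single-bit upset

    # A fault-free sum must have the predicted residue.
    return result, residue(result, modulus) == predicted


if __name__ == "__main__":
    print(checked_add(25, 17))
    print(checked_add(25, 17, inject_fault_bit=4))
```

A single flipped bit changes the result by a power of two, and no power of two is divisible by 3, so under these assumptions any single-bit upset in the sum changes its residue and is flagged. In a scheme like the one the abstract describes, a flagged mismatch would trigger the rollback to a checkpoint and re-execution.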