DocumentCode
2261698
Title
A strategy for soft error reduction in multi core designs
Author
Hyman, Ransford, Jr. ; Bhattacharya, Koustav ; Ranganathan, Nagarajan
Author_Institution
Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL, USA
fYear
2009
fDate
24-27 May 2009
Firstpage
2217
Lastpage
2220
Abstract
With the continuous decrease in minimum feature size and increase in chip density, modern processors are becoming increasingly susceptible to soft errors. In the past, lockstep execution with redundant threads on duplicated pipelines has been used for soft error rate reduction; it can achieve high error coverage, but at the cost of large area and performance overheads. In this paper, we propose techniques for protection against soft errors in multi-core designs using (i) the properties of spatial and temporal redundancy and (ii) value based detection. We utilize temporal redundancy by using the "latency use slack" (LSC) of an instruction, which we define as the number of cycles before the computed result from the instruction becomes the source operand of a subsequent instruction, while spatial redundancy is exploited by duplicating the instruction to a nearby idle processor core. Further, the value based detection technique is explored by exploiting the width of the operands with small data values and the generation of residue code check bits for the source operands. When a soft error is detected, error correction is achieved by rolling back the execution to a previous checkpoint state and re-executing the instructions. The proposed techniques have been implemented on the RSIM simulation framework and validated using the SPLASH benchmarks. Our results indicate that the soft error detection schemes proposed in this work can be implemented, on average, with less than a 10% increase in CPI on modern multi-core designs.
Keywords
error correction; microprocessor chips; radiation hardening (electronics); redundancy; RSIM simulation; SPLASH benchmarks; duplicated pipelines; error correction; latency use slack; multicore designs; processor core; soft error rate reduction; spatial redundancy; temporal redundancy; value based detection; Computer errors; Costs; Error analysis; Error correction; Logic; Multicore processing; Pipelines; Protection; Redundancy; Yarn;
fLanguage
English
Publisher
IEEE
Conference_Titel
Circuits and Systems, 2009. ISCAS 2009. IEEE International Symposium on
Conference_Location
Taipei
Print_ISBN
978-1-4244-3827-3
Electronic_ISBN
978-1-4244-3828-0
Type
conf
DOI
10.1109/ISCAS.2009.5118238
Filename
5118238