DocumentCode :
2136632
Title :
Exploiting coarse-grain verification parallelism for power-efficient fault tolerance
Author :
Rashid, M. Wasiur ; Tan, Edwin J. ; Huang, Michael C. ; Albonesi, David H.
Author_Institution :
Dept. of Electr. & Comput. Eng., Rochester Univ., NY, USA
fYear :
2005
fDate :
17-21 Sept. 2005
Firstpage :
315
Lastpage :
325
Abstract :
As device dimensions continue to be aggressively scaled, microprocessors are becoming increasingly vulnerable to the impact of undesired energy, such as that of a cosmic particle strike, which can cause transient errors. To prevent operational failure due to these errors, system-level techniques such as redundant execution will be increasingly required for fault detection and tolerance in future processors. However, the need for redundancy is directly opposed to the growing need for more power-efficient operation. Conventional techniques that use multi-core microarchitectures to provide whole-thread duplication generally incur significant energy overhead, which can exacerbate the already severe problem of power consumption and heat dissipation given a certain throughput requirement. In the future, approaches that supply the necessary level of robustness at a given throughput level must also be power-aware. We propose a thread-level redundant execution microarchitecture that significantly reduces the energy overhead of replication without unduly impacting performance. Our approach exploits the fact that, with appropriate hardware support, the verification operation can be parallelized and run on a chip multiprocessor with support for frequency scaling together with supply voltage scaling and/or body biasing. To further improve the efficiency of verification, we exploit the information obtained by the leading thread to assist the trailing verification threads. We discuss in detail the required architectural support and show that our approach can be highly energy-efficient: using two checkers, fully replicated execution costs only an average of 28% extra energy over non-redundant execution, with virtually no performance loss.
Keywords :
fault tolerant computing; formal verification; multiprocessing systems; body biasing; chip multiprocessor; coarse-grain verification parallelism; fault detection; frequency scaling; microprocessors; multicore microarchitectures; operational failure; power-efficient fault tolerance; supply voltage scaling; thread-level redundant execution microarchitecture; transient errors; Energy consumption; Fault detection; Fault tolerance; Hardware; Microarchitecture; Microprocessors; Redundancy; Robustness; Throughput; Yarn;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel Architectures and Compilation Techniques, 2005. PACT 2005. 14th International Conference on
ISSN :
1089-795X
Print_ISBN :
0-7695-2429-X
Type :
conf
DOI :
10.1109/PACT.2005.20
Filename :
1515603