DocumentCode
26377
Title
A Low-Cost Mechanism Exploiting Narrow-Width Values for Tolerating Hard Faults in ALU
Author
Seokin Hong ; Soontae Kim
Author_Institution
Dept. of Comput. Sci., Korea Adv. Inst. of Sci. & Technol., Daejeon, South Korea
Volume
64
Issue
9
fYear
2015
fDate
Sept. 1 2015
Firstpage
2433
Lastpage
2446
Abstract
Digital circuits are expected to suffer increasingly from hard faults as technology scales. In particular, a single hard fault in the ALU (Arithmetic Logic Unit) might lead to a total processor failure or significantly reduce performance. To address these problems, we propose a novel cost-efficient fault-tolerant mechanism for the ALU, called LIZARD. LIZARD employs two half-word ALUs, instead of a single full-word ALU, to perform computations with concurrent fault detection. When a fault is detected, the two ALUs are partitioned into four quarter-word ALUs. After diagnosing and isolating the faulty quarter-word ALU, LIZARD continues operating with the remaining ones, which can detect and isolate another fault. Even though LIZARD uses narrow ALUs for computations, it adds negligible performance overhead by exploiting the predictability of results in arithmetic computations. We also present the architectural modifications required to employ LIZARD in scalar as well as superscalar processors. Through comparative evaluation, we demonstrate that LIZARD outperforms other competitive fault-tolerant mechanisms in terms of area, energy consumption, performance, and reliability.
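The two ideas the abstract relies on, narrow-width values and full-word results produced by half-word ALUs, can be illustrated with a minimal sketch. This is not LIZARD's actual design, which the abstract does not specify. The 32-bit word size, the function names (`is_narrow`, `half_add`, `add32_with_half_alus`), and the low-half-then-high-half carry chaining are all assumptions made for illustration.

```python
from typing import Tuple

# Assumed word sizes for illustration: 32-bit full word, 16-bit half word.
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def is_narrow(value: int) -> bool:
    """Return True if the upper half of a 32-bit word is only the sign
    extension of its lower half, i.e. the value fits in a signed half word."""
    value &= MASK32
    upper = value >> 16
    sign = (value >> 15) & 1
    return upper == (MASK16 if sign else 0)


def half_add(a: int, b: int, carry_in: int = 0) -> Tuple[int, int]:
    """Model a hypothetical 16-bit adder: return (sum, carry_out)."""
    total = (a & MASK16) + (b & MASK16) + carry_in
    return total & MASK16, total >> 16


def add32_with_half_alus(a: int, b: int) -> int:
    """Compose a 32-bit addition from two 16-bit additions,
    low half first, passing the carry into the high half."""
    lo, carry = half_add(a, b)
    hi, _ = half_add(a >> 16, b >> 16, carry)
    return (hi << 16) | lo
```

When both operands are narrow, the high half of the sum is usually predictable from the low-half result, which is the kind of property a narrow ALU could exploit to limit performance overhead. How LIZARD predicts results and detects faults is described in the paper itself, not in this sketch.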
Keywords
digital arithmetic; digital circuits; failure analysis; fault tolerance; logic circuits; ALU; LIZARD; arithmetic logic unit; failure; fault tolerant mechanism; hard faults; low-cost mechanism; narrow-width values; processors; Adders; Circuit faults; Fault detection; Fault diagnosis; Fault tolerant systems; Program processors; Arithmetic and logic units; Processor architectures
fLanguage
English
Journal_Title
Computers, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9340
Type
jour
DOI
10.1109/TC.2014.2366743
Filename
6945834