DocumentCode :
1961893
Title :
A co-design approach for fault-tolerant loop execution on Coarse-Grained Reconfigurable Arrays
Author :
Lari, Vahid ; Tanase, Alexandru ; Teich, Jurgen ; Witterauf, Michael ; Khosravi, Faramarz ; Hannig, Frank ; Meyer, Brett H.
Author_Institution :
Dept. of Comput. Sci., Friedrich-Alexander-Univ. Erlangen-Nurnberg (FAU), Erlangen, Germany
fYear :
2015
fDate :
15-18 June 2015
Firstpage :
1
Lastpage :
8
Abstract :
We present a co-design approach for applying redundancy schemes such as Dual Modular Redundancy (DMR) and Triple Modular Redundancy (TMR) to a whole region of a processor array for a class of Coarse-Grained Reconfigurable Arrays (CGRAs). The approach targets applications with mixed-criticality properties that experience varying Soft Error Rates (SERs) due to environmental conditions, e.g., changing altitude. The core idea is to adapt the degree of fault protection for loop programs executing in parallel on a CGRA to the required level of reliability as well as to SER profiles. This is realized by claiming neighboring regions of processing elements for the execution of replicated loop nests. First, at the source code level, a compiler transformation is proposed that realizes these replication schemes in two steps: (1) replicate the given parallel loop program two or three times for DMR or TMR, respectively, and (2) add appropriate error handling functions (comparison or voting) in order to detect or correct, respectively, any single error. Then, using the opportunities of hardware/software co-design, we propose optimized implementations of the error handling functions in software as well as in hardware. Finally, experimental results are given that analyze the reliability gains of each proposed array replication scheme as a function of the SER.
Keywords :
hardware-software codesign; parallel architectures; reconfigurable architectures; software fault tolerance; CGRA; DMR; SER; TMR; coarse-grained reconfigurable array; dual modular redundancy; error handling function; fault-tolerant loop execution; mixed-criticality property; soft error rate; triple modular redundancy; Hardware; Parallel processing; Redundancy; Registers; Schedules; Software; Tunneling magnetoresistance
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Adaptive Hardware and Systems (AHS), 2015 NASA/ESA Conference on
Conference_Location :
Montreal, QC
Type :
conf
DOI :
10.1109/AHS.2015.7231157
Filename :
7231157