Title :
Aging-aware compiler-directed VLIW assignment for GPGPU architectures
Author :
Rahimi, Azar ; Benini, Luca ; Gupta, R.K.
Author_Institution :
CSE, UC San Diego, La Jolla, CA, USA
fDate :
May 29 2013-June 7 2013
Abstract :
Negative bias temperature instability (NBTI) adversely affects the reliability of a processor by introducing new delay-induced faults. However, the effect of these delay variations is not spread uniformly across functional units and instructions: some are affected more, and are hence less reliable, than others. This paper proposes an NBTI-aware, compiler-directed very long instruction word (VLIW) assignment scheme that distributes the stress of instructions uniformly, with the aim of minimizing aging of the GPGPU architecture without any performance penalty. The proposed solution is an entirely software technique based on static workload characterization and online execution with NBTI monitoring. It equalizes the expected lifetime of each processing element by regenerating aging-aware healthy kernels that respond to the specific health state of the GPGPU. We demonstrate our approach on the AMD Evergreen architecture, where iso-throughput executions of the healthy kernels reduce the NBTI-induced voltage threshold shift by up to 49% (11%) compared to naïve kernel executions, with (without) architectural support for power-gating. The kernel adaption flow takes an average of 13 milliseconds on a typical host machine, making it suitable for practical implementation.
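The abstract does not give the assignment algorithm itself, so the following is only an illustrative sketch of the general idea of stress balancing, not the authors' method. It assumes a hypothetical per-slot aging reading and a hypothetical per-instruction stress estimate, and it greedily places the most stressful instruction of one VLIW bundle on the least-aged slot. The function name `assign_slots`, the numeric values, and the simplification that any instruction may occupy any slot are all assumptions made for illustration.

```python
def assign_slots(bundle_stress, slot_aging):
    """Greedy sketch: pair high-stress instructions with low-aging slots.

    bundle_stress: dict mapping instruction name -> estimated stress
                   (hypothetical, unitless).
    slot_aging:    dict mapping slot name -> monitored aging level
                   (hypothetical, e.g. a threshold shift reading).
    Returns a dict mapping instruction name -> slot name.

    Assumption: every instruction may run on every slot. Real VLIW
    hardware restricts some operations to particular slots.
    """
    if len(bundle_stress) > len(slot_aging):
        raise ValueError("bundle has more instructions than available slots")

    # Instructions ordered from most to least stressful.
    instructions = sorted(bundle_stress, key=lambda i: bundle_stress[i],
                          reverse=True)
    # Slots ordered from least to most aged.
    slots = sorted(slot_aging, key=lambda s: slot_aging[s])

    # Pair them off: the heaviest instruction gets the healthiest slot.
    return dict(zip(instructions, slots))


# Hypothetical aging readings for a five-slot VLIW processing element.
slot_aging = {"X": 0.030, "Y": 0.012, "Z": 0.021, "W": 0.008, "T": 0.025}
# Hypothetical stress estimates for the instructions in one bundle.
bundle = {"MUL": 0.9, "ADD": 0.4, "MOV": 0.1}

print(assign_slots(bundle, slot_aging))
# {'MUL': 'W', 'ADD': 'Y', 'MOV': 'Z'}
```

Because the sketch only permutes instructions among the slots of a single bundle, the number of bundles issued is unchanged, which is in the spirit of the iso-throughput claim in the abstract.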
Keywords :
graphics processing units; instruction sets; negative bias temperature instability; operating system kernels; parallel architectures; program compilers; AMD Evergreen architecture; GPGPU architectures; NBTI monitoring; NBTI-aware compiler-directed very long instruction word assignment scheme; NBTI-induced voltage threshold shift reduction; aging minimization; aging-aware compiler-directed VLIW assignment; aging-aware healthy kernel regeneration; delay variation effect; delay-induced faults; functional units; general purpose graphical processing units; host machine; kernel adaption flow; naïve kernel executions; negative bias temperature instability; online execution; power-gating; processor reliability; software technique; static workload characterization; Aging; Computer architecture; Degradation; Kernel; Sensors; Stress; VLIW; Adaptive Kernel; Aging-aware Compilation; Dynamic Binary Optimizer; GPGPU; NBTI; VLIW;
Conference_Title :
Design Automation Conference (DAC), 2013 50th ACM/EDAC/IEEE
Conference_Location :
Austin, TX