• DocumentCode
    619465
  • Title

    Aging-aware compiler-directed VLIW assignment for GPGPU architectures

  • Author

    Rahimi, Abbas ; Benini, Luca ; Gupta, R.K.

  • Author_Institution
    CSE, UC San Diego, La Jolla, CA, USA
  • fYear
    2013
  • fDate
    May 29 2013-June 7 2013
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Negative bias temperature instability (NBTI) adversely affects the reliability of a processor by introducing new delay-induced faults. However, the effect of these delay variations is not uniformly spread across functional units and instructions: some are affected more (and are hence less reliable) than others. This paper proposes an NBTI-aware compiler-directed very long instruction word (VLIW) assignment scheme that uniformly distributes the stress of instructions with the aim of minimizing aging of GPGPU architectures without any performance penalty. The proposed solution is an entirely software technique based on static workload characterization and online execution with NBTI monitoring. It equalizes the expected lifetime of each processing element by regenerating aging-aware healthy kernels that respond to the specific health state of the GPGPU. We demonstrate our approach on the AMD Evergreen architecture, where iso-throughput executions of the healthy kernels reduce the NBTI-induced voltage threshold shift by up to 49% (11%) compared to naïve kernel executions, with (without) architectural support for power-gating. The kernel adaption flow takes an average of 13 milliseconds on a typical host machine, making it suitable for practical implementation.
  • Keywords
    graphics processing units; instruction sets; negative bias temperature instability; operating system kernels; parallel architectures; program compilers; AMD Evergreen architecture; GPGPU architectures; NBTI monitoring; NBTI-aware compiler-directed very long instruction word assignment scheme; NBTI-induced voltage threshold shift reduction; aging minimization; aging-aware compiler-directed VLIW assignment; aging-aware healthy kernel regeneration; delay variation effect; delay-induced faults; functional units; general purpose graphical processing units; host machine; kernel adaption flow; naïve kernel executions; negative bias temperature instability; online execution; power-gating; processor reliability; software technique; static workload characterization; Aging; Computer architecture; Degradation; Kernel; Sensors; Stress; VLIW; Adaptive Kernel; Aging-aware Compilation; Dynamic Binary Optimizer; GPGPU; NBTI; VLIW;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Design Automation Conference (DAC), 2013 50th ACM/EDAC/IEEE
  • Conference_Location
    Austin, TX
  • ISSN
    0738-100X
  • Type

    conf

  • Filename
    6560609