• DocumentCode
    129504
  • Title

    GPGPUs: How to combine high computational power with high reliability

  • Author

    Bautista Gomez, L. ; Cappello, Franck ; Carro, Luigi ; DeBardeleben, Nathan ; Fang, B. ; Gurumurthi, Sudhanva ; Pattabiraman, Karthik ; Rech, P. ; Sonza Reorda, M.

  • Author_Institution
    Argonne Nat. Lab., Argonne, IL, USA
  • fYear
    2014
  • fDate
    24-28 March 2014
  • Firstpage
    1
  • Lastpage
    9
  • Abstract
    GPGPUs are used increasingly in several domains, from gaming to different kinds of computationally intensive applications. In many applications GPGPU reliability is becoming a serious issue, and several research activities are focusing on its evaluation. This paper offers an overview of some major results in the area. First, it shows and analyzes the results of some experiments assessing GPGPU reliability in HPC datacenters. Second, it provides some recent results derived from radiation experiments about the reliability of GPGPUs. Third, it describes the characteristics of an advanced fault-injection environment, allowing effective evaluation of the resiliency of applications running on GPGPUs.
  • Keywords
    circuit reliability; computer centres; graphics processing units; GPGPU; HPC data center; advanced fault-injection environment; computational power; general-purpose graphics processing unit; radiation experiment; reliability; Benchmark testing; Graphics processing units; Instruction sets; Laboratories; Neutrons; Performance evaluation; Reliability; GPGPUs; HPC; fault injection; radiation; reliability;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014
  • Conference_Location
    Dresden
  • Type

    conf

  • DOI
    10.7873/DATE.2014.354
  • Filename
    6800555