  • DocumentCode
    166200
  • Title
    GPU-Qin: A methodology for evaluating the error resilience of GPGPU applications
  • Author
    Fang, Bo ; Pattabiraman, Karthik ; Ripeanu, Matei ; Gurumurthi, Sudhanva
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Univ. of British Columbia, Vancouver, BC, Canada
  • fYear
    2014
  • fDate
    23-25 March 2014
  • Firstpage
    221
  • Lastpage
    230
  • Abstract
    While graphics processing units (GPUs) have gained wide adoption as accelerators for general-purpose applications (GPGPU), the end-to-end reliability implications of their use have not been quantified. Fault injection is a widely used method for evaluating the reliability of applications. However, building a fault injector for GPGPU applications is challenging due to their massive parallelism, which makes it difficult to achieve representativeness while being time-efficient. This paper makes three key contributions. First, it presents the design of a fault-injection methodology to evaluate end-to-end reliability properties of application kernels running on GPUs. Second, it introduces a fault-injection tool that uses real GPU hardware and offers a good balance between the representativeness and the efficiency of the fault-injection experiments. Third, this paper characterizes the error resilience of twelve GPGPU applications.
  • Keywords
    fault tolerant computing; graphics processing units; GPGPU applications; GPU-Qin methodology; end-to-end reliability implications; error resilience evaluation; fault-injection methodology; general-purpose graphics processing units; Graphics processing units; Hardware; Instruction sets; Parallel processing; Registers; Resilience; Transient analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Performance Analysis of Systems and Software (ISPASS), 2014 IEEE International Symposium on
  • Conference_Location
    Monterey, CA
  • Print_ISBN
    978-1-4799-3604-5
  • Type
    conf
  • DOI
    10.1109/ISPASS.2014.6844486
  • Filename
    6844486