• DocumentCode
    3537807
  • Title

    An Instruction-Level Energy Estimation and Optimization Methodology for GPU

  • Author

    Wang, Yue ; Ranganathan, Nagarajan

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL, USA
  • fYear
    2011
  • fDate
    Aug. 31 2011-Sept. 2 2011
  • Firstpage
    621
  • Lastpage
    628
  • Abstract
    GPU architectures are now widely used in research on computer graphics and other areas of scientific computing. The parallel computing capability of the GPU improves the performance of many programs. However, as the degree of parallelism grows, so does the number of active GPU cores required for execution, and the resulting rise in energy consumption has begun to draw attention. Previous research [1] reveals that, for a given multicore program, energy consumption first falls and then rises as the number of active cores increases. This means that energy consumption can be minimized if the number of active cores is properly configured. In this paper, we develop an instruction-level prediction mechanism to estimate the energy consumption of a given program under different numbers of cores. The prediction is based on a profile of the Parallel Thread Execution (PTX) [2] code generated during compilation of the original program. With this mechanism, the energy-optimal number of cores can be found during compilation and used in execution, replacing the number given by the programmer. Tests were carried out on several NVIDIA CUDA [10] benchmarks. The results show that energy consumption is minimized without losing much performance. With the predicted energy-optimal number of active cores, the energy savings for the selected benchmarks range from 7.31% to 11.76% on average, with a worst-case performance loss of 4.92%.
  • Keywords
    computer graphic equipment; coprocessors; parallel architectures; GPU architecture; NVIDIA CUDA benchmarks; instruction-level energy estimation; instruction-level prediction mechanism; minimum energy consumption; optimization methodology; parallel computing; parallel thread execution codes; Benchmark testing; Energy consumption; Energy measurement; Graphics processing unit; Instruction sets; Multicore processing; Registers; GPU; energy; energy estimation; energy optimization; instruction-level; multicore;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Technology (CIT), 2011 IEEE 11th International Conference on
  • Conference_Location
    Pafos
  • Print_ISBN
    978-1-4577-0383-6
  • Electronic_ISBN
    978-0-7695-4388-8
  • Type

    conf

  • DOI
    10.1109/CIT.2011.69
  • Filename
    6036835
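
The selection step described in the abstract (estimate energy for each candidate core count from an instruction profile, then keep the minimum) can be illustrated with a minimal sketch. This is not the authors' model. The cost table `COST`, the power and frequency parameters, the serial-cycle count, and both function names are hypothetical values and names chosen only so that the estimated energy first falls and then rises with the core count, as the abstract reports.

```python
from typing import Dict, Iterable

# Hypothetical per-opcode costs, NOT taken from the paper:
# opcode -> (dynamic energy per instruction in nJ, latency in cycles).
COST = {
    "add.f32": (1.0, 4),
    "mul.f32": (1.2, 4),
    "ld.global": (8.0, 400),
}


def estimate_energy_nj(
    profile: Dict[str, int],
    cores: int,
    serial_cycles: float = 1_000_000,  # assumed non-parallel part
    base_power_w: float = 20.0,        # assumed power drawn regardless of cores
    core_power_w: float = 0.5,         # assumed static power per active core
    freq_hz: float = 1.0e9,            # assumed clock frequency
) -> float:
    """Estimate total energy (nJ) for an opcode-count profile on `cores` cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Dynamic energy depends only on which instructions run, not on core count.
    dynamic_nj = sum(count * COST[op][0] for op, count in profile.items())
    # Assume the profiled instructions divide evenly across the active cores.
    parallel_cycles = sum(count * COST[op][1] for op, count in profile.items()) / cores
    seconds = (serial_cycles + parallel_cycles) / freq_hz
    # Static energy: more cores shorten the run but draw more power.
    static_nj = (base_power_w + cores * core_power_w) * seconds * 1e9
    return dynamic_nj + static_nj


def energy_optimal_cores(profile: Dict[str, int], candidates: Iterable[int]) -> int:
    """Return the candidate core count with the lowest estimated energy."""
    return min(candidates, key=lambda n: estimate_energy_nj(profile, n))


if __name__ == "__main__":
    # Hypothetical opcode counts standing in for a PTX profile.
    profile = {"add.f32": 1_000_000, "mul.f32": 1_000_000, "ld.global": 100_000}
    best = energy_optimal_cores(profile, range(1, 241))
    print(best, estimate_energy_nj(profile, best))
```

Under these assumed parameters the trade-off is between a shorter run time and a higher power draw, so the minimum falls strictly between the smallest and largest candidate core counts. A real estimator would replace `COST` and the power constants with measured values.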