• DocumentCode
    3706504
  • Title
    Automatic Performance Tuning of Stencil Computations on GPUs
  • Author
    Joseph D. Garvey; Tarek S. Abdelrahman
  • Author_Institution
    Edward S. Rogers Sr. Dept. of Electr. &
  • fYear
    2015
  • Firstpage
    300
  • Lastpage
    309
  • Abstract
    We consider automatic performance tuning of stencil computations on Graphics Processing Units. We present a strategy that uses machine learning to determine the best way to use memory, followed by a heuristic that divides the remaining optimizations into groups and exhaustively explores one group at a time. We evaluate our strategy using 102 synthetically generated OpenCL stencil kernels on an Nvidia GTX Titan GPU. We assess our strategy both in terms of the number of configurations explored during auto-tuning and the quality of the best configuration obtained. We explore two alternative heuristics that use different groupings of the optimizations. We show that, relative to a random sampling of the space and an expert search, our strategy achieves a reduction in the number of configurations explored of up to 80% and 84%, respectively, while also finding better-performing configurations.
  • Keywords
    "Optimization","Kernel","Merging","Yttrium","Graphics processing units","Parallel processing","Instruction sets"
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing (ICPP), 2015 44th International Conference on
  • ISSN
    0190-3918
  • Type
    conf
  • DOI
    10.1109/ICPP.2015.39
  • Filename
    7349585
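
The abstract describes a heuristic that splits the optimizations into groups and exhaustively explores one group at a time. The following is a minimal sketch of that general idea only, not the authors' implementation: the parameter names (`block_x`, `block_y`, `unroll`, `merge`), their candidate values, the grouping, and the cost function `toy_cost` are all hypothetical stand-ins for a real kernel timing run. The machine learning step that selects the memory configuration is not covered here.

```python
from itertools import product


def groupwise_search(groups, cost, initial):
    """Exhaustively tune one group of parameters at a time.

    groups  : list of dicts mapping parameter name -> list of candidate values
    cost    : callable taking a full configuration dict, lower is better
    initial : starting configuration dict covering every parameter
    Returns (best_config, best_cost, evaluations).
    """
    best = dict(initial)
    best_cost = cost(best)
    evaluations = 1
    for group in groups:
        names = list(group)
        # Try every combination inside this group; parameters from all
        # other groups stay fixed at their best values found so far.
        for values in product(*(group[n] for n in names)):
            candidate = dict(best)
            candidate.update(zip(names, values))
            if candidate == best:
                continue  # already measured, skip the duplicate run
            c = cost(candidate)
            evaluations += 1
            if c < best_cost:
                best, best_cost = candidate, c
    return best, best_cost, evaluations


def toy_cost(cfg):
    # Hypothetical cost model standing in for a measured kernel runtime.
    return (abs(cfg["block_x"] - 32) + abs(cfg["block_y"] - 4)
            + abs(cfg["unroll"] - 2) + (0 if cfg["merge"] == 2 else 1))


# Hypothetical grouping: work-group shape first, then loop transformations.
groups = [
    {"block_x": [8, 16, 32, 64], "block_y": [1, 2, 4, 8]},
    {"unroll": [1, 2, 4], "merge": [1, 2, 4]},
]
initial = {"block_x": 8, "block_y": 1, "unroll": 1, "merge": 1}

best, best_cost, evaluations = groupwise_search(groups, toy_cost, initial)
full_space = 16 * 9  # size of the full cross product of both groups
```

In this toy setup the group-wise search measures 24 configurations instead of the 144 in the full cross product. It reaches the optimum here only because `toy_cost` has no interaction between the two groups; with interacting optimizations the result depends on how the groups are chosen, which is presumably why the abstract compares alternative groupings.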