DocumentCode
3706504
Title
Automatic Performance Tuning of Stencil Computations on GPUs
Author
Joseph D. Garvey;Tarek S. Abdelrahman
Author_Institution
Edward S. Rogers Sr. Dept. of Electr. &
fYear
2015
Firstpage
300
Lastpage
309
Abstract
We consider automatic performance tuning of stencil computations on Graphics Processing Units (GPUs). We present a strategy that uses machine learning to determine the best way to use memory, followed by a heuristic that divides the remaining optimizations into groups and exhaustively explores one group at a time. We evaluate our strategy using 102 synthetically generated OpenCL stencil kernels on an Nvidia GTX Titan GPU. We assess our strategy in terms of both the number of configurations explored during auto-tuning and the quality of the best configuration obtained. We explore two alternative heuristics that use different groupings of the optimizations. We show that, relative to a random sampling of the space and an expert search, our strategy reduces the number of configurations explored by up to 80% and 84%, respectively, while also finding better-performing configurations.
Keywords
"Optimization","Kernel","Merging","Yttrium","Graphics processing units","Parallel processing","Instruction sets"
Publisher
IEEE
Conference_Titel
Parallel Processing (ICPP), 2015 44th International Conference on
ISSN
0190-3918
Type
conf
DOI
10.1109/ICPP.2015.39
Filename
7349585