DocumentCode
62929
Title
Enabling Portable Optimizations of Data Placement on GPU
Author
Guoyang Chen ; Bo Wu ; Dong Li ; Xipeng Shen
Volume
35
Issue
4
fYear
2015
fDate
July-Aug. 2015
Firstpage
16
Lastpage
24
Abstract
Modern GPU memory systems manifest more varieties, increasing complexities, and rapid changes. Different placements of data on memory systems often cause significant differences in program performance. Most current GPU programming systems rely on programmers to indicate the appropriate placements, but finding the appropriate placements is difficult for programmers in practice owing to the complexity and fast changes of memory systems, as well as the input sensitivity of appropriate data placements; that is, the best placements often differ when a program runs on a different input data set. This article introduces a software framework called Porple. It offers a solution that, for the first time, makes it possible to automatically enhance data placement across a GPU. Through Porple, a GPU program's data gets placed appropriately on memory on the fly, customized to the current input dataset. Moreover, when new memory systems arrive, it can easily adapt the placements accordingly. Experiments on three types of GPU systems show that Porple consistently finds optimal or near-optimal placement, yielding up to 2.93 times (1.75 times average on three generations of GPU) speedups compared to programmers' decisions.
Keywords
computational complexity; data handling; graphics processing units; storage management; GPU programming system; Porple software framework; complexity; modern GPU memory systems; near-optimal placement; portable data placement optimization; program performance; programmer decision; Benchmark testing; Complexity theory; Computer programs; Graphics processing units; Memory; Runtime; GPU; cache; compiler; data placement; hardware specification language
fLanguage
English
Journal_Title
Micro, IEEE
Publisher
IEEE
ISSN
0272-1732
Type
jour
DOI
10.1109/MM.2015.53
Filename
7106396