Title :
Characterizing and enhancing global memory data coalescing on GPUs
Author :
Fauzia, Naznin ; Pouchet, Louis-Noel ; Sadayappan, P.
Author_Institution :
Ohio State Univ., Columbus, OH, USA
Abstract :
Effective parallel programming for GPUs requires careful attention to several factors, including ensuring coalesced access of data from global memory. There is a need for tools that can provide feedback to users about statements in a GPU kernel where non-coalesced data access occurs, and assistance in fixing the problem. In this paper, we address both these needs. We develop a two-stage framework where dynamic analysis is first used to detect and characterize uncoalesced accesses in arbitrary PTX programs. Transformations to optimize global memory access by introducing coalesced access are then implemented, using feedback from the dynamic analysis or using a model-driven approach. Experimental results demonstrate the use of the tools on a number of benchmarks from the Rodinia and Polybench suites.
Keywords :
graphics processing units; parallel programming; storage management; system monitoring; GPU; Polybench suites; Rodinia suites; arbitrary PTX programs; coalesced access; dynamic analysis; global memory data coalescing; model-driven approach; uncoalesced access characterization; uncoalesced access detection; Geometry; Instruction sets; Instruments; Kernel; Optimization; Performance analysis; PTX; coalescing; locality; polyhedral compilation; program transformation
Conference_Title :
Code Generation and Optimization (CGO), 2015 IEEE/ACM International Symposium on
Conference_Location :
San Francisco, CA
DOI :
10.1109/CGO.2015.7054183