Title :
Program Optimization of Stencil Based Application on the GPU-Accelerated System
Author :
Wang, Guibin ; Yang, Xuejun ; Zhang, Ying ; Tang, Tao ; Fang, Xudong
Author_Institution :
Nat. Lab. for Parallel & Distrib. Process., Nat. Univ. of Defense Technol., Changsha, China
Abstract :
Graphics Processing Units (GPUs), with many lightweight data-parallel cores, can provide substantial parallel computational power to accelerate general-purpose applications. However, this computing capacity cannot be fully utilized by memory-intensive applications, which are limited by off-chip memory bandwidth and latency. Stencil computation has abundant parallelism and low computational intensity, which makes it a useful architectural evaluation benchmark. In this paper, we propose several memory optimizations for mgrid, a stencil-based application from the SPEC 2K benchmarks. By exploiting data locality in the 3-level memory hierarchy and tuning the thread granularity, we reduce the pressure on off-chip memory bandwidth. To hide the long off-chip memory access latency, we further prefetch data during computation through double buffering. To fully exploit the CPU-GPU heterogeneous system, we redistribute the computation between these two computing resources. With all these optimizations, we gain a 24.2x speedup over the simple mapping version, and as much as a 34.3x speedup over a CPU implementation.
Keywords :
benchmark testing; coprocessors; data handling; distributed processing; memory architecture; network operating systems; software engineering; storage management; GPU-accelerated System; SPEC 2K benchmarks; data prefetching; graphic processing unit; heterogeneous system; memory optimization; stencil based application; Acceleration; Application software; Bandwidth; Concurrent computing; Delay; Distributed computing; Parallel processing; Prefetching; Yarn; CUDA; GPGPU; mgrid; stencil
Conference_Title :
Parallel and Distributed Processing with Applications, 2009 IEEE International Symposium on
Conference_Location :
Chengdu
Print_ISBN :
978-0-7695-3747-4
DOI :
10.1109/ISPA.2009.70