DocumentCode :
3732333
Title :
Optimizing Complex Spatially-Variant Coefficient Stencils for Seismic Modeling on GPU
Author :
Jiarui Fang;Haohuan Fu;He Zhang;Wei Wu;Nanxun Dai;Lin Gan;Guangwen Yang
Author_Institution :
Minist. of Educ. Key Lab. for Earth Syst. Modeling, Tsinghua Univ., Beijing, China
fYear :
2015
Firstpage :
641
Lastpage :
648
Abstract :
The Explicit Time Evolution (ETE) method is an innovative Finite-Difference (FD) type method for simulating wave propagation in acoustic media with higher spatial and temporal accuracy. Unlike FD, however, ETE is difficult to implement efficiently on GPUs because of the poor memory access patterns caused by its off-axis points and spatially-variant coefficients. In this paper, we present a set of new optimization strategies for ETE stencils tailored to the memory hierarchy of NVIDIA GPUs. To handle the complexity of the stencil shapes, we design a one-to-multi updating scheme for shared memory usage. To alleviate the performance degradation resulting from the poor memory access pattern of reading spatially-variant coefficients, we propose a stencil decomposition method that reduces un-coalesced global memory access. On a state-of-the-art GPU architecture, combining these techniques with existing spatial and temporal stencil blocking schemes, we achieve 9.6x and 9.9x speedups over a well-tuned 12-core CPU version for 37-point and 73-point ETE stencils, respectively. Compared with a well-tuned MIC version, the best speedups for the two stencil types are 3.7x and 4.7x. Our designs lead to an ETE method that is 31.2x faster than the conventional CPU-FD method, making it a practical seismic imaging technology.
Keywords :
"Graphics processing units","Computer architecture","Computational modeling","Three-dimensional displays","Mathematical model","Microwave integrated circuits","Optimization"
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2015 IEEE 21st International Conference on
Electronic_ISBN :
1521-9097
Type :
conf
DOI :
10.1109/ICPADS.2015.86
Filename :
7384349