Title :
Parallel Computation of the Weather Research and Forecast (WRF) WDM5 Cloud Microphysics on a Many-Core GPU
Author :
Wang, Jun ; Huang, Bormin ; Huang, Allen ; Goldberg, Mitchell D.
Author_Institution :
Space Sci. & Eng. Center, Univ. of Wisconsin-Madison, Madison, WI, USA
Abstract :
The Weather Research and Forecast (WRF) Double Moment 5-class (WDM5) mixed-ice microphysics scheme predicts the mixing ratios of hydrometeors and the number concentrations of warm-rain species, including clouds and rain. WDM5 can be computed in parallel over the horizontal domain on a many-core GPU. To obtain better GPU performance, we manually rewrote the original WDM5 Fortran module as a highly parallel CUDA C program. The GPU-based implementation of the WDM5 microphysics scheme on one GTX590 GPU achieves a speedup of 147× over its CPU-based single-threaded counterpart when we use asynchronous data transfer and non-coalesced memory access. More importantly, the speedup excluding the host-device data transfer time is 206× when using coalesced memory access. Because the WDM5 microphysics scheme is only an intermediate module of the full WRF model, its input data should already be available in GPU global memory from preceding modules, and its output data should remain in GPU global memory for use by later modules.
Keywords :
clouds; geophysics computing; graphics processing units; multiprocessing systems; parallel architectures; physics computing; weather forecasting; CPU-based single-threaded counterpart; GPU global memory; GPU-based implementation; GTX590 GPU; WDM5 Fortran module; WDM5 microphysics scheme; WRF WDM5 cloud microphysics; Weather Research and Forecast; asynchronous data transfer; double moment 5-class; highly parallel CUDA C program; horizontal domain; hydrometeors; many-core GPU; mixed ice microphysics scheme; noncoalesced memory access; parallel computation; warm rain species; Clouds; Graphics processing unit; Ice; Instruction sets; Rain; Snow; Wavelength division multiplexing; CUDA; GPU; WDM5 microphysics scheme; WRF;
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2011 IEEE 17th International Conference on
Conference_Location :
Tainan
Print_ISBN :
978-1-4577-1875-5
DOI :
10.1109/ICPADS.2011.160