Title :
Further Improvement on GPU-Based Parallel Implementation of WRF 5-Layer Thermal Diffusion Scheme
Author :
Huang, Melin ; Huang, Bormin ; Mielikainen, Jarno ; Huang, H. L. Allen ; Goldberg, Mitchell D. ; Mehta, A.
Author_Institution :
Space Sci. & Eng. Center, Univ. of Wisconsin-Madison, Madison, WI, USA
Abstract :
The Weather Research and Forecasting (WRF) model is widely used for weather prediction and atmospheric simulation, serving both operational forecasting and research. Land-surface models (LSMs) are components of the WRF model that provide heat and moisture fluxes over land and sea-ice points. The 5-layer thermal diffusion scheme is an LSM based on the MM5 soil temperature model, with an energy budget made up of sensible, latent, and radiative heat fluxes. Because horizontal grid points do not interact with one another, LSMs are well suited to massively parallel processing. This study presents a parallel implementation of the WRF 5-layer thermal diffusion scheme on a graphics processing unit (GPU). Since this scheme is only one intermediate module of the entire WRF model, no I/O transfer is involved in the intermediate process. Using one NVIDIA GTX 680 GPU and excluding I/O transfer, our optimized GPU-based 5-layer thermal diffusion scheme reaches a speedup of up to 247.5x with respect to one CPU core, whereas one CPU socket achieves a speedup of only 3.1x with respect to one CPU core. With three GPUs, the speedup with respect to one CPU core rises to 332x.
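The point about non-interacting horizontal grid points can be illustrated with a minimal sketch. This is not the WRF scheme or the authors' CUDA code: the function names, the uniform coefficient `alpha`, the fixed boundary temperatures, and the explicit finite-difference update are all assumptions chosen for illustration. It only shows that each soil column can be updated from its own values, so columns can be handed to independent workers (one GPU thread per column in a CUDA setting) and still give the same result as a serial loop.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_LAYERS = 5  # five soil layers, per the scheme's name; all other values are assumed


def step_column(temps, surface_temp, deep_temp, alpha):
    """Advance one soil column by one explicit diffusion step (hypothetical helper).

    alpha stands for kappa * dt / dz**2, assumed uniform here and kept <= 0.5
    for stability. Only this column's values are read; no horizontal
    neighbour is touched.
    """
    padded = [surface_temp] + list(temps) + [deep_temp]
    return [
        padded[k] + alpha * (padded[k - 1] - 2.0 * padded[k] + padded[k + 1])
        for k in range(1, NUM_LAYERS + 1)
    ]


def step_grid_serial(columns, surface_temps, deep_temp, alpha):
    """Update every column one after another."""
    return [
        step_column(col, surf, deep_temp, alpha)
        for col, surf in zip(columns, surface_temps)
    ]


def step_grid_parallel(columns, surface_temps, deep_temp, alpha, workers=4):
    """Update columns through a worker pool; stands in for one thread per column."""
    def work(pair):
        col, surf = pair
        return step_column(col, surf, deep_temp, alpha)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, zip(columns, surface_temps)))
```

Python threads give no real speedup for this kind of arithmetic; the pool is used only to make the column independence explicit. In a GPU port the same per-column body would typically become the kernel executed by each thread.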
Keywords :
geophysics computing; graphics processing units; parallel processing; thermal diffusion; weather forecasting; 5-layer thermal diffusion simulation; CPU core; CPU socket; GPU-based parallel implementation; MM5 soil temperature model; NVIDIA GTX 680 GPU; WRF 5-layer thermal diffusion scheme; atmospheric simulation; graphics processing unit; land points; land-surface models; massively parallel processing; moisture fluxes; parallel computing; radiative heat fluxes; sea-ice points; weather prediction; weather research and forecasting model; Atmospheric modeling; Computational modeling; Graphics processing units; Instruction sets; Registers; Runtime; Weather forecasting; 5-layer thermal diffusion; Compute Unified Device Architecture (CUDA); Graphics Processing Unit (GPU); Weather Research and Forecasting (WRF); speedup;
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2013 International Conference on
Conference_Location :
Seoul
DOI :
10.1109/ICPADS.2013.126