DocumentCode :
560183
Title :
Multi-science applications with single codebase — GAMER — For massively parallel architectures
Author :
Shukla, Hemant ; Schive, Hsi-Yu ; Woo, Tak-Pong ; Chiueh, Tzihong
Author_Institution :
Lawrence Berkeley Nat. Lab., Berkeley, CA, USA
fYear :
2011
fDate :
12-18 Nov. 2011
Firstpage :
1
Lastpage :
11
Abstract :
The growing need for power-efficient, extreme-scale high-performance computing (HPC), coupled with plateauing clock speeds, is driving the emergence of massively parallel compute architectures. Tens to many hundreds of cores are increasingly made available as compute units, either as an integral part of the main processor or as coprocessors designed to handle massively parallel workloads. In the case of many-core graphics processing units (GPUs), hundreds of SIMD cores primarily designed for image and video rendering are used for high-performance scientific computation. The new architectures typically offer standard programming models such as CUDA (NVIDIA) and OpenCL. However, wide-ranging adoption of these parallel architectures involves a steep learning curve and requires reengineering of existing applications, which mostly leads to expensive and error-prone code rewrites without any prior guarantee or knowledge of speedups. A broad range of complex scientific applications across many domains use common algorithms and techniques, such as adaptive mesh refinement (AMR), advanced hydrodynamics partial differential equation (PDE) solvers, and Poisson-gravity solvers, that have demonstrably performed with high efficiency on GPU-based systems. Taking advantage of these commonalities, we use the GPU-aware AMR code GAMER [1] to examine the unique approach of solving multi-science problems in astrophysics, hydrodynamics, and particle physics with a single codebase. We demonstrate significant speedups in disparate classes of scientific applications on three separate clusters, viz., Dirac, Laohu, and Mole 8.5. By extensively reusing the extensible single codebase, we mitigate the impediments of significant code rewrites. We also collect performance and energy-consumption benchmark metrics on 50 nodes of NVIDIA C2050 GPUs and Intel 8-core Nehalem CPUs on the Dirac cluster at the National Energy Research Scientific Computing Center (NERSC).
In addition, we propose a strategy and framework for legacy and new applications to successfully leverage the evolving GAMER codebase on massively parallel architectures. The framework and the benchmarks are intended to help quantify adoption strategies for legacy and new scientific applications.
Keywords :
ANSI standards; energy consumption; graphics processing units; mainframes; parallel architectures; parallel machines; power aware computing; rendering (computer graphics); 50-nodes NVIDIA C2050 GPU; ANSI standard programming models; GAMER codebase; GPU based systems; GPU-aware AMR code; Intel 8-core Nehalem CPU; National Energy Research Supercomputing Center; OpenCL; Poisson-Gravity solvers; SIMD cores; adaptive mesh refinements; advanced hydrodynamics partial differential equation solvers; clock-speeds; energy consumption benchmark metrics; error prone code rewriting; image rendering; many-core graphics processing units; massively parallel compute architectures; multiscience applications; parallel workload handling; power efficient extreme-scale high performance computing; video rendering; Computational modeling; Graphics processing unit; Instruction sets; Kernel; Mathematical model; Memory management; AMR; GPU; Poisson-Gravity solvers; benchmarks; hydrodynamics; simulations;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
High Performance Computing, Networking, Storage and Analysis (SC), 2011 International Conference for
Conference_Location :
Seattle, WA
Electronic_ISBN :
978-1-4503-0771-0
Type :
conf
Filename :
6114450