DocumentCode :
3638973
Title :
Parallelizing the QUDA Library for Multi-GPU Calculations in Lattice Quantum Chromodynamics
Author :
Ronald Babich;Michael A. Clark;Balint Joó
Author_Institution :
Center for Comput. Sci., Boston Univ., Boston, MA, USA
Year :
2010
Firstpage :
1
Lastpage :
11
Abstract :
Graphics Processing Units (GPUs) are having a transformational effect on numerical lattice quantum chromodynamics (LQCD) calculations of importance in nuclear and particle physics. The QUDA library provides a package of mixed precision sparse matrix linear solvers for LQCD applications, supporting single GPUs based on NVIDIA's Compute Unified Device Architecture (CUDA). This library, interfaced to the QDP++/Chroma framework for LQCD calculations, is currently in production use on the "9g" cluster at the Jefferson Laboratory, enabling unprecedented price/performance for a range of problems in LQCD. Nevertheless, memory constraints on current GPU devices limit the problem sizes that can be tackled. In this contribution we describe the parallelization of the QUDA library onto multiple GPUs using MPI, including strategies for the overlapping of communication and computation. We report on both weak and strong scaling for up to 32 GPUs interconnected by InfiniBand, on which we sustain in excess of 4 Tflops.
Keywords :
"Instruction sets","Graphics processing unit","Lattices","Kernel","Libraries","Bandwidth","Image color analysis"
Publisher :
IEEE
Conference_Title :
High Performance Computing, Networking, Storage and Analysis (SC), 2010 International Conference for
Print_ISBN :
978-1-4244-7557-5
Type :
conf
DOI :
10.1109/SC.2010.40
Filename :
5644900