DocumentCode
3538014
Title
The Fat-Link Computation on Large GPU Clusters for Lattice QCD
Author
Shi, Guochun ; Babich, Ronald ; Clark, Michael A. ; Joó, Bálint ; Gottlieb, Steven ; Kindratenko, Volodymyr
Author_Institution
Nat. Center for Supercomput. Applic. (NCSA), Univ. of Illinois, Urbana, IL, USA
fYear
2012
fDate
10-11 July 2012
Firstpage
1
Lastpage
10
Abstract
Graphics Processing Units (GPUs) are becoming increasingly popular in high-performance computing due to their high performance, high power efficiency, and low cost. In this paper, we present results of an effort to implement the fat-link computation - an important component of many lattice quantum chromodynamics (LQCD) calculations - on GPU clusters using the QUDA framework. Two implementations, one similar to the original CPU algorithm in the MILC code and one based on the idea of reduced communication by redundant computations, are presented and their relative advantages are discussed. In strong-scaling tests on up to 384 GPUs on the Longhorn GPU cluster and 256 GPUs on the Keeneland GPU cluster, where the CPU core to GPU ratio is 4:1 in both clusters, we achieved up to 11.4x and 8.7x node speedup, respectively.
Keywords
energy conservation; graphics processing units; parallel architectures; physics computing; power aware computing; quantum chromodynamics; CPU core; Keeneland GPU clusters; LQCD calculations; Longhorn GPU clusters; QUDA framework; fat-link computation; graphics processing units; high performance computing; large GPU clusters; lattice QCD; lattice quantum chromodynamics; power efficiency; Graphics processing unit; Indexes; Instruction sets; Kernel; Lattices; Layout; USA Councils; CUDA; GPU; Lattice QCD; MILC; QUDA; Quantum Chromodynamics;
fLanguage
English
Publisher
IEEE
Conference_Titel
Application Accelerators in High Performance Computing (SAAHPC), 2012 Symposium on
Conference_Location
Chicago, IL
ISSN
2166-5133
Print_ISBN
978-1-4673-2882-1
Type
conf
DOI
10.1109/SAAHPC.2012.10
Filename
6319185