DocumentCode :
3470888
Title :
Efficient Implementation of the Overlap Operator on Multi-GPUs
Author :
Alexandru, Andrei ; Lujan, Mikel ; Pelissier, C. ; Gamari, B. ; Lee, Fred
Author_Institution :
Dept. of Phys., George Washington Univ., Washington, DC, USA
fYear :
2011
fDate :
19-21 July 2011
Firstpage :
123
Lastpage :
130
Abstract :
Lattice QCD calculations were one of the first applications to show the potential of GPUs in the area of high performance computing. Our interest is to find ways to effectively use GPUs for lattice calculations using the overlap operator. The large memory footprint of these codes requires the use of multiple GPUs in parallel. In this paper we show the methods we used to implement this operator efficiently. We run our codes both on a GPU cluster and a CPU cluster with similar interconnects. We find that to match performance the CPU cluster requires 20-30 times more CPU cores than GPUs.
Keywords :
computer graphic equipment; coprocessors; parallel processing; quantum chromodynamics; CPU cluster; GPU cluster; QCD calculation; high performance computing; lattice calculations; memory footprint; multiGPU; overlap operator; Approximation methods; Bandwidth; Graphics processing unit; Kernel; Lattices; Memory management; Polynomials; GPU; lattice QCD; overlap;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Application Accelerators in High-Performance Computing (SAAHPC), 2011 Symposium on
Conference_Location :
Knoxville, TN
Print_ISBN :
978-1-4577-0635-6
Electronic_ISBN :
978-0-7695-4448-9
Type :
conf
DOI :
10.1109/SAAHPC.2011.13
Filename :
6031575