Regional Information Center for Science and Technology - A Trip to Tahiti: Approaching a 5 TFlop SGEMM Using 3 AMD GPUs

DocumentCode :

3538032

Title :

A Trip to Tahiti: Approaching a 5 TFlop SGEMM Using 3 AMD GPUs

Author :

Weber, Rick ; Peterson, Gregory D.

Author_Institution :

Dept. of EECS, Univ. of Tennessee, Knoxville, TN, USA

fYear :

2012

fDate :

10-11 July 2012

Firstpage :

Lastpage :

Abstract :

Using GPUs as computational accelerators has been a growing area of research in the past several years. One particular area amenable to exploiting video card hardware is dense linear algebra. We continue this trend by generalizing the MAGMA xGEMM kernels, porting them to OpenCL and tuning them to run on the AMD 7970. Achieving up to 1.7 TFlops in SGEMM and 650 GFlops in DGEMM, we extend this performance to multiple GPUs using a parallel-for algorithm designed to run on multiple heterogeneous devices. Using 3 Radeon 7970s, our large GEMM algorithm obtains 4.37 TFlops in single precision and 1.64 TFlops in double precision.
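
To put the multi-GPU figures in context, the short sketch below estimates how close the 3-GPU results come to ideal linear scaling. It uses only the numbers quoted in the abstract. The helper name `scaling_efficiency` is hypothetical, and because the single-GPU figures are stated as "up to" peaks, the result is a rough lower-bound estimate rather than a measurement reported by the authors.

```python
def scaling_efficiency(multi_gpu_tflops: float,
                       single_gpu_tflops: float,
                       num_gpus: int) -> float:
    """Fraction of ideal linear scaling reached by the multi-GPU run."""
    return multi_gpu_tflops / (single_gpu_tflops * num_gpus)


# Figures quoted in the abstract; single-GPU values are peak ("up to") rates.
sgemm_eff = scaling_efficiency(4.37, 1.7, 3)    # single precision
dgemm_eff = scaling_efficiency(1.64, 0.650, 3)  # double precision (650 GFlops)

print(f"SGEMM: {sgemm_eff:.1%} of ideal 3-GPU scaling")
print(f"DGEMM: {dgemm_eff:.1%} of ideal 3-GPU scaling")
```

Under these assumptions both precisions land at roughly 84 to 86 percent of the ideal 3x speedup over one GPU's peak rate.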

Keywords :

graphics processing units; matrix algebra; AMD 7970; AMD GPU; DGEMM; MAGMA xGEMM kernels; OpenCL; Radeon 7970; TFlop SGEMM; computational accelerators; dense linear algebra; double-precision general matrix multiplication; multiple heterogeneous devices; parallel-for algorithm; single-precision general matrix multiplication; video card hardware; Computer architecture; Graphics processing unit; Indexes; Kernel; Vectors; BLAS; GEMM; GPU; OpenCL; matrix multiply;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Application Accelerators in High Performance Computing (SAAHPC), 2012 Symposium on

Conference_Location :

Chicago, IL

ISSN :

2166-5133

Print_ISBN :

978-1-4673-2882-1

Type :

conf

DOI :

10.1109/SAAHPC.2012.19

Filename :

6319187

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3538032