DocumentCode
3000391
Title
Implementation and Evaluation of Triple Precision BLAS Subroutines on GPUs
Author
Mukunoki, Daichi ; Takahashi, Daisuke
Author_Institution
Grad. Sch. of Syst. & Inf. Eng., Univ. of Tsukuba, Tsukuba, Japan
fYear
2012
fDate
21-25 May 2012
Firstpage
1378
Lastpage
1386
Abstract
We implemented and evaluated the triple precision Basic Linear Algebra Subprograms (BLAS) subroutines AXPY, GEMV, and GEMM on a Tesla C2050 GPU. In this paper, we present a Double Single (D+S) type triple precision floating-point value format and its operations, which are based on techniques similar to those used for Double-Double (DD) type quadruple precision operations. On the GPU, the D+S-type operations are more costly than the DD-type operations, both in theory and in practice. Consequently, triple precision GEMM, which is a compute-bound operation, is slower than quadruple precision GEMM. However, triple precision AXPY and GEMV are memory-bound operations on the GPU, so their execution time is close to 3/4 of that of the corresponding quadruple precision subroutines. We therefore conclude that the triple precision value format is useful for memory-bound operations in cases where quadruple precision is not required but double precision is not sufficient.
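The 3/4 figure for memory-bound routines is consistent with a simple storage argument. The sketch below is illustrative only and is not taken from the paper: it assumes a D+S value is stored as one 64-bit double (high part) plus one 32-bit single (low part), and a DD value as two 64-bit doubles. The helper names (to_float32, split_ds, split_dd) are hypothetical, and the exact-rational splitting is a host-side demonstration, not the GPU arithmetic the authors evaluated.

```python
import struct
from fractions import Fraction


def to_float32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def split_ds(value: Fraction):
    """Assumed D+S layout: hi is a double, lo is the residual rounded to single."""
    hi = float(value)                              # nearest double to value
    lo = to_float32(float(value - Fraction(hi)))   # residual, rounded to single
    return hi, lo


def split_dd(value: Fraction):
    """Assumed DD layout: hi and lo are both doubles."""
    hi = float(value)
    lo = float(value - Fraction(hi))               # residual, rounded to double
    return hi, lo


def abs_error(value: Fraction, parts) -> Fraction:
    """Exact absolute error of the unevaluated sum of parts against value."""
    return abs(value - sum(Fraction(p) for p in parts))


# Bytes per element under the assumed layouts ("=" means no padding).
DS_BYTES = struct.calcsize("=df")   # 8 + 4
DD_BYTES = struct.calcsize("=dd")   # 8 + 8

if __name__ == "__main__":
    x = Fraction(1, 3)
    print("double error:", float(abs_error(x, (float(x),))))
    print("D+S error   :", float(abs_error(x, split_ds(x))))
    print("DD error    :", float(abs_error(x, split_dd(x))))
    print("bytes D+S / DD:", DS_BYTES, "/", DD_BYTES, "=", DS_BYTES / DD_BYTES)
```

Under these assumptions each D+S element moves 12 bytes where a DD element moves 16, so a kernel limited by memory bandwidth would be expected to take roughly 12/16 = 3/4 of the time, while the extra low-order single still gives more accuracy than a plain double.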
Keywords
floating point arithmetic; graphics processing units; linear algebra; D+S-type operation; DD-type operation; GEMV; GPU; Tesla C2050; compute-bound operation; double single type triple precision floating-point value format; double-double type quadruple precision operation; memory-bound operation; quadruple precision GEMM; triple precision AXPY; triple precision BLAS subroutine; triple precision GEMM; triple precision basic linear algebra subprogram; triple precision value format; Algorithms; Arrays; Graphics processing unit; Instruction sets; Kernel; Layout; Libraries; BLAS; GPU; triple precision;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
Conference_Location
Shanghai
Print_ISBN
978-1-4673-0974-5
Type
conf
DOI
10.1109/IPDPSW.2012.175
Filename
6270805