DocumentCode :
2606475
Title :
Fixed Point Lanczos: Sustaining TFLOP-equivalent Performance in FPGAs for Scientific Computing
Author :
Jerez, Juan L. ; Constantinides, George A. ; Kerrigan, Eric C.
Author_Institution :
Dept. of Electr. & Electron. Eng., Imperial Coll. London, London, UK
fYear :
2012
fDate :
April 29 2012-May 1 2012
Firstpage :
53
Lastpage :
60
Abstract :
We consider the problem of enabling fixed-point implementations of linear algebra kernels to match the strengths of the field-programmable gate array (FPGA). Algorithms for solving linear equations, finding eigen values or finding singular values are typically nonlinear and recursive making the problem of establishing analytical bounds on variable dynamic range non-trivial. Current approaches fail to provide tight bounds for this type of algorithms. We use as a case study one of the most important kernels in scientific computing, the Lanczos iteration, which lies at the heart of well known methods such as conjugate gradient and minimum residual, and we show how we can modify the algorithm to allow us to apply standard linear algebra analysis to prove tight analytical bounds on all variables of the process, regardless of the properties of the original matrix. It is shown that the numerical behaviour of fixed-point implementations of the modified problem can be chosen to be at least as good as a double precision floating point implementation. Using this approach it is possible to get sustained FPGA performance very close to the peak general-purpose graphics processing unit (GPGPU) performance in FPGAs of comparable size when solving a single problem. If there are several independent problems to solve simultaneously it is possible to exceed the peak floating-point performance of a GPGPU, obtaining approximately 1, 2 or 4 TFLOPs for error tolerances of 10-7, 10-5 and 10-3, respectively, in a large Virtex 7 FPGA.
Keywords :
conjugate gradient methods; eigenvalues and eigenfunctions; field programmable gate arrays; fixed point arithmetic; graphics processing units; iterative methods; FPGA; GPGPU; Lanczos iteration; TFLOP-equivalent performance; conjugate gradient; double precision floating point; eigenvalues; error tolerance; field-programmable gate array; fixed point Lanczos; general-purpose graphics processing unit; linear algebra kernel; linear equation; minimum residual; peak floating-point; scientific computing; Accuracy; Dynamic range; Eigenvalues and eigenfunctions; Field programmable gate arrays; Kernel; Linear algebra; Symmetric matrices; Field programmable gate arrays; Fixed-point arithmetic; High performance computing; Iterative algorithms; Scientific computing;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Field-Programmable Custom Computing Machines (FCCM), 2012 IEEE 20th Annual International Symposium on
Conference_Location :
Toronto, ON
Print_ISBN :
978-1-4673-1605-7
Type :
conf
DOI :
10.1109/FCCM.2012.19
Filename :
6239791
Link To Document :
بازگشت