DocumentCode
3045391
Title
Analysis of high-performance floating-point arithmetic on FPGAs
Author
Govindu, Gokul ; Zhuo, Ling ; Choi, Seonil ; Prasanna, Viktor
Author_Institution
Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA, USA
fYear
2004
fDate
26-30 April 2004
Firstpage
149
Abstract
Summary form only given. FPGAs are increasingly being used in the high-performance and scientific computing community to implement floating-point-based hardware accelerators. We analyze the floating-point multiplier and adder/subtractor units by considering the number of pipeline stages of the units as a parameter and use throughput/area as the metric. We achieve throughput rates of more than 240 MHz (200 MHz) for single (double) precision operations by deeply pipelining the units. To illustrate the impact of the floating-point units on a kernel, we implement a matrix multiplication kernel based on our floating-point units and show that a state-of-the-art FPGA device is capable of achieving about 15 GFLOPS (8 GFLOPS) for single (double) precision floating-point matrix multiplication. We also show that FPGAs are capable of achieving up to a 6x improvement (for single precision) in the GFLOPS/W (performance per unit power) metric over general-purpose processors. We then discuss the impact of floating-point units on the design of an energy-efficient architecture for the matrix multiply kernel.
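The three metrics named in the abstract (throughput/area, GFLOPS, and GFLOPS/W) can be sketched as simple ratios. The following is a minimal illustration, not the authors' method: the function names, the use of slices as the area unit, the assumption that a fully pipelined unit completes one operation per clock cycle, and every number in the usage lines are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only. All names and numbers are hypothetical
# assumptions; none of them come from the paper.

def throughput_per_area(freq_mhz: float, area_slices: int) -> float:
    """Throughput/area metric: clock rate (MHz) per unit of area (slices)."""
    return freq_mhz / area_slices


def peak_gflops(num_units: int, freq_mhz: float) -> float:
    """Peak GFLOPS, assuming each fully pipelined unit finishes
    one floating-point operation per clock cycle."""
    return num_units * freq_mhz / 1000.0


def gflops_per_watt(gflops: float, watts: float) -> float:
    """Performance per unit power."""
    return gflops / watts


def best_pipeline_depth(candidates: dict) -> int:
    """Pick the pipeline depth with the highest throughput/area.

    `candidates` maps pipeline depth -> (freq_mhz, area_slices).
    """
    return max(
        candidates,
        key=lambda depth: throughput_per_area(*candidates[depth]),
    )


# Hypothetical design points: deeper pipelines clock faster but cost area.
design_points = {
    4: (100.0, 500),
    8: (180.0, 600),
    16: (240.0, 1000),
}
chosen_depth = best_pipeline_depth(design_points)
kernel_gflops = peak_gflops(num_units=50, freq_mhz=200.0)
efficiency = gflops_per_watt(kernel_gflops, watts=5.0)
print(chosen_depth, kernel_gflops, efficiency)
```

Treating the pipeline depth as the free parameter and ranking design points by throughput/area mirrors the analysis the abstract describes: the fastest-clocked unit is not necessarily the best choice once its extra area is taken into account.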
Keywords
field programmable gate arrays; floating point arithmetic; matrix multiplication; parallel architectures; pipeline arithmetic; FPGA; adder-subtractor unit; energy efficient architecture; field programmable gate array; floating-point multiplier; hardware accelerator; high-performance floating arithmetic; matrix multiplication kernel; pipeline stages; scientific computing; Delay; Energy efficiency; Field programmable gate arrays; Floating-point arithmetic; Frequency; Kernel; Pipeline processing; Scientific computing; Signal processing algorithms; Throughput;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing Symposium, 2004. Proceedings. 18th International
Print_ISBN
0-7695-2132-0
Type
conf
DOI
10.1109/IPDPS.2004.1303135
Filename
1303135