Title :
A high-performance and energy-efficient architecture for floating-point based LU decomposition on FPGAs
Author :
Govindu, Gokul ; Choi, Seonil ; Prasanna, Viktor ; Daga, Vikash ; Gangadharpalli, Sridhar ; Sridhar, V.
Author_Institution :
Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA, USA
Abstract :
Summary form only given. We first develop a novel architecture for fixed-point LU decomposition of streaming input matrices on FPGAs. Our architecture, based on a circular linear array, achieves minimal latency and is resource-efficient. We then extend it, using a stacked matrices approach, to a floating-point architecture that achieves minimal effective latency. Our design objective was to develop high-throughput, energy-efficient architectures for applications that require LU decomposition. We analyze (1) the impact of high-throughput, pipelined floating-point units (with different pipeline depths and different performance) on the architecture's performance, and (2) the impact of algorithm-level design on system-wide energy dissipation. We analyze the energy dissipation by capturing algorithmic and architectural details of the target FPGA device. We compare our architecture with a state-of-the-art architecture implemented on FPGAs with respect to latency, area, and energy. Our designs achieve a 10%-60% reduction in energy relative to the state-of-the-art architecture.
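For orientation only, the computation that the abstract's architecture accelerates can be sketched in software. The block below is a minimal, sequential Doolittle LU decomposition without pivoting. It is not the authors' circular linear array or stacked matrices design, which the abstract does not describe in detail; the function name `lu_decompose` and the list-of-lists matrix layout are assumptions made for this illustration.

```python
def lu_decompose(a):
    """Factor a square matrix A into L (unit lower triangular) and U (upper
    triangular) so that A = L * U.

    Illustrative sketch only: plain Doolittle elimination with no pivoting,
    so it fails on a zero pivot. `a` is a list of equal-length rows.
    """
    n = len(a)
    u = [list(map(float, row)) for row in a]  # working copy that becomes U
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    for k in range(n):
        pivot = u[k][k]
        if pivot == 0.0:
            raise ValueError("zero pivot; this sketch does no pivoting")
        for i in range(k + 1, n):
            factor = u[i][k] / pivot      # multiplier stored in L
            l[i][k] = factor
            for j in range(k, n):         # eliminate row i using row k
                u[i][j] -= factor * u[k][j]
    return l, u


if __name__ == "__main__":
    l, u = lu_decompose([[4.0, 3.0], [6.0, 3.0]])
    print(l)
    print(u)
```

The innermost update is one multiply and one subtract per matrix element, which is presumably the kind of operation where the depth of a pipelined floating-point unit would matter, though the abstract does not spell out that mapping.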
Keywords :
field programmable gate arrays; floating point arithmetic; parallel architectures; pipeline processing; FPGA; circular linear array; energy-efficient architecture; field programmable gate array; fixed-point LU decomposition; high performance architecture; pipelined floating-point unit; stacked matrices approach; system-wide energy dissipation; Algorithm design and analysis; Computer applications; Computer architecture; Delay; Energy dissipation; Energy efficiency; Field programmable gate arrays; Matrix decomposition; Performance analysis; Pipeline processing;
Conference_Title :
Parallel and Distributed Processing Symposium, 2004. Proceedings. 18th International
Print_ISBN :
0-7695-2132-0
DOI :
10.1109/IPDPS.2004.1303134