Title :
A Reconfigurable Architecture for QR Decomposition Using a Hybrid Approach
Author :
Wang, Xinying ; Jones, Philip ; Zambreno, Joseph
Author_Institution :
Dept. of Electr. & Comput. Eng., Iowa State Univ., Ames, IA, USA
Abstract :
QR decomposition has been widely used in many signal processing applications to solve linear inverse problems. However, QR decomposition is considered a computationally expensive process, and its sequential implementations fail to meet the requirements of many time-sensitive applications. The Householder transformation and the Givens rotation are the most popular techniques for performing QR decomposition. Each of these approaches has its own strengths and weaknesses. The Householder transformation lends itself to efficient sequential implementation; however, its inherent data dependencies complicate parallelization. The structure of the Givens rotation, on the other hand, provides many opportunities for concurrency, but its performance is typically limited by the availability of computing resources. We propose a deeply pipelined reconfigurable architecture that can be dynamically configured to perform either approach in a manner that takes advantage of the strengths of each. At runtime, the input matrix is first partitioned into numerous sub-matrices. Our architecture then performs parallel Householder transformations on the sub-matrices in the same column block, followed by parallel Givens rotations to annihilate the remaining individual off-diagonal elements. Analysis of our design indicates the potential to achieve a performance of 10.5 GFLOPS, with speedups of up to 1.46×, 1.15×, and 13.75× compared to the MKL implementation, a recent FPGA design, and a Matlab solution, respectively.
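The two-stage scheme described in the abstract can be illustrated with a small software sketch. This is not the authors' hardware design: it is a sequential, pure-Python approximation written under several assumptions that the abstract does not state (square tiles of size `b`, a column block as wide as one tile, at least as many rows as columns, and only the R factor computed). The function names `householder_tile`, `givens_eliminate`, and `hybrid_qr_r` are hypothetical. The tiles handled in stage 1 are mutually independent, which is where a parallel implementation could process them concurrently.

```python
import math

def householder_tile(A, r0, r1, c0, w):
    """Householder-reduce rows r0..r1-1 so that, within columns
    c0..c0+w-1, the tile has zeros below its local diagonal."""
    n = len(A[0])
    for k in range(min(r1 - r0, w)):
        p, c = r0 + k, c0 + k                 # pivot row and column
        v = [A[i][c] for i in range(p, r1)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            continue                          # column already zero
        v[0] += math.copysign(norm, v[0])     # v = x + sign(x0)*||x||*e1
        vv = sum(x * x for x in v)
        for j in range(c, n):                 # apply I - 2*v*v^T/(v^T v)
            f = 2.0 * sum(v[i] * A[p + i][j] for i in range(len(v))) / vv
            for i in range(len(v)):
                A[p + i][j] -= f * v[i]

def givens_eliminate(A, pivot, row, c):
    """Rotate rows `pivot` and `row` so that A[row][c] becomes zero."""
    a, b = A[pivot][c], A[row][c]
    r = math.hypot(a, b)
    if r == 0.0:
        return
    cs, sn = a / r, b / r
    for j in range(c, len(A[0])):
        x, y = A[pivot][j], A[row][j]
        A[pivot][j] = cs * x + sn * y
        A[row][j] = -sn * x + cs * y

def hybrid_qr_r(A, b):
    """Return the upper-triangular factor of A (m >= n assumed),
    using tile size b (an assumed parameter)."""
    R = [row[:] for row in A]
    m, n = len(R), len(R[0])
    for c0 in range(0, n, b):                 # one column block at a time
        w = min(b, n - c0)
        # Stage 1: independent Householder reductions, one per tile.
        for r0 in range(c0, m, b):
            householder_tile(R, r0, min(r0 + b, m), c0, w)
        # Stage 2: Givens rotations remove what is left in the lower
        # tiles, pivoting against the diagonal rows of the top tile.
        top_end = min(c0 + b, m)
        for c in range(c0, c0 + w):
            for r in range(top_end, m):
                if R[r][c] != 0.0:
                    givens_eliminate(R, c, r, c)
    return R
```

Calling `hybrid_qr_r(A, 2)` on a 6-by-4 list-of-lists matrix returns a matrix of the same shape whose entries below the diagonal are zero up to rounding error. Because only orthogonal transformations are applied, the product of the result's transpose with itself matches that of the input, which gives a simple correctness check.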
Keywords :
matrix decomposition; reconfigurable architectures; transforms; vectors; Householder transformation; MKL implementation; QR decomposition; computing resources availability; inherent data dependencies; input matrix; linear inverse problems; many signal processing applications; parallel Givens rotations; reconfigurable architecture; sequential implementation; time-sensitive applications; Adders; Computer architecture; Field programmable gate arrays; MATLAB; Matrix decomposition; Parallel processing; Vectors; Architecture; FPGA; Givens rotation; Householder transformation; QR decomposition;
Conference_Title :
VLSI (ISVLSI), 2014 IEEE Computer Society Annual Symposium on
Conference_Location :
Tampa, FL
Print_ISBN :
978-1-4799-3763-9
DOI :
10.1109/ISVLSI.2014.92