DocumentCode :
3533592
Title :
FPGA accelerating three QR decomposition algorithms in the unified pipelined framework
Author :
Dou, Yong ; Zhou, Jie ; Chen, Xiaoyang ; Lei, Yuanwu ; Xu, Jinbo
Author_Institution :
Nat. Lab. for Parallel & Distrib. Process., Nat. Univ. of Defense Technol., Changsha, China
fYear :
2009
fDate :
Aug. 31 2009-Sept. 2 2009
Firstpage :
410
Lastpage :
416
Abstract :
Many FPGA implementations of QR decomposition have been studied for small-scale matrices, and all of them are presented individually. However, to the best of our knowledge, there is no FPGA-based accelerator for large-scale QR decomposition. In this paper, we propose a unified FPGA accelerator structure for large-scale QR decomposition. To exploit the computational potential of the FPGA, we introduce a fine-grained parallel algorithm for QR decomposition. A scalable linear array of processing elements (PEs), which is the core component of the FPGA accelerator, is proposed to implement this algorithm. A total of 15 PEs can be integrated into an Altera Stratix II EP2S130F1020C5 on our self-designed board. Experimental results show that a speedup of a factor of 4 and a maximum power-performance of 60.9 can be achieved compared to a Pentium Dual CPU with a double SSE thread.
Keywords :
field programmable gate arrays; logic design; matrix algebra; parallel algorithms; FPGA accelerator; large-scale QR decomposition; parallel algorithm; scalable linear array processing element; unified pipelined framework; Acceleration; Central Processing Unit; Concurrent computing; Field programmable gate arrays; Hardware; Large-scale systems; Parallel algorithms; Pipelines; Signal processing algorithms; Yarn;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Field Programmable Logic and Applications, 2009. FPL 2009. International Conference on
Conference_Location :
Prague
ISSN :
1946-1488
Print_ISBN :
978-1-4244-3892-1
Electronic_ISBN :
1946-1488
Type :
conf
DOI :
10.1109/FPL.2009.5272252
Filename :
5272252