Title :
Resource-efficient acceleration of 2-dimensional Fast Fourier Transform computations on FPGAs
Author :
Kee, Hojin ; Bhattacharyya, Shuvra S. ; Petersen, Newton ; Kornerup, Jacob
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Maryland, College Park, MD, USA
fDate :
Aug. 30 2009-Sept. 2 2009
Abstract :
The 2-dimensional (2D) fast Fourier transform (FFT) is a fundamental, computationally intensive function that is of broad relevance to distributed smart camera systems. In this paper, we develop a systematic method for improving the throughput of 2D-FFT implementations on field-programmable gate arrays (FPGAs). Our method is based on a novel loop unrolling technique for FFT implementation, which is extended from our recent work on FPGA architectures for 1D-FFT implementation. This unrolling technique deploys multiple processing units within a single 1D-FFT core to achieve efficient configurations of data parallelism while minimizing memory space requirements and FPGA slice consumption. Furthermore, using our techniques for parallel processing within individual 1D-FFT cores, the number of input/output (I/O) ports within a given 1D-FFT core is limited to one input port and one output port. In contrast, previous 2D-FFT design approaches require multiple I/O pairs with multiple FFT cores. This streamlining of 1D-FFT interfaces makes it possible to avoid complex interconnection networks and associated scheduling logic for connecting multiple I/O ports from 1D-FFT cores to the I/O channel of external memory devices. Hence, our proposed unrolling technique maximizes the ratio of the achieved throughput to the consumed FPGA resources under pre-defined constraints on I/O channel bandwidth. To provide generality, our framework for 2D-FFT implementation can be efficiently parameterized in terms of key design parameters such as the transform size and I/O data word length.
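As a rough software illustration of how a 2D-FFT can be built from a single 1D-FFT routine with one input and one output, the sketch below uses the standard row-column decomposition: a 1D transform over every row, then over every column. It is not the paper's loop unrolling technique or its FPGA architecture; the function names `fft1d` and `fft2d` are hypothetical, and the transform size is assumed to be a power of two.

```python
import cmath

def fft1d(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half (butterfly step).
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft2d(image):
    """Row-column 2D FFT: 1D FFT over each row, then over each column.

    Illustrative only (assumption): every pass reuses the same fft1d routine,
    loosely mirroring a single 1D-FFT core fed one line at a time.
    """
    rows = [fft1d(row) for row in image]
    cols = [fft1d(list(col)) for col in zip(*rows)]
    # Transpose back so result[u][v] is indexed like the input.
    return [list(r) for r in zip(*cols)]
```

In this formulation the intermediate result must be read back column by column, which is why the access pattern to the memory holding it tends to dominate the design of hardware versions.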
Keywords :
fast Fourier transforms; field programmable gate arrays; parallel processing; 2D-FFT; FPGA; distributed smart camera system; fast Fourier transform; field-programmable gate array; loop unrolling technique; memory management; parallel processing; resource-efficient acceleration; Acceleration; Distributed computing; Fast Fourier transforms; Field programmable gate arrays; Joining processes; Logic devices; Multiprocessor interconnection networks; Parallel processing; Smart cameras; Throughput; 2-D Fast Fourier Transform; FPGA-based system design; High-level synthesis; Memory management;
Conference_Titel :
Distributed Smart Cameras, 2009. ICDSC 2009. Third ACM/IEEE International Conference on
Conference_Location :
Como
Print_ISBN :
978-1-4244-4620-9
Electronic_ISBN :
978-1-4244-4620-9
DOI :
10.1109/ICDSC.2009.5289356