DocumentCode :
3109016
Title :
High performance discrete Fourier transforms on graphics processors
Author :
Govindaraju, Naga K. ; Lloyd, Brandon ; Dotsenko, Yuri ; Smith, Burton ; Manferdelli, John
fYear :
2008
fDate :
15-21 Nov. 2008
Firstpage :
1
Lastpage :
12
Abstract :
We present novel algorithms for computing discrete Fourier transforms with high performance on GPUs. We present hierarchical, mixed radix FFT algorithms for both power-of-two and non-power-of-two sizes. Our hierarchical FFT algorithms efficiently exploit shared memory on GPUs using a Stockham formulation. We reduce the memory transpose overheads in hierarchical algorithms by combining the transposes into a block-based multi-FFT algorithm. For non-power-of-two sizes, we use a combination of mixed radix FFTs of small primes and Bluestein's algorithm. We use modular arithmetic in Bluestein's algorithm to improve the accuracy. We implemented our algorithms using the NVIDIA CUDA API and compared their performance with NVIDIA's CUFFT library and an optimized CPU implementation (Intel's MKL) on a high-end quad-core CPU. On an NVIDIA GPU, we obtained performance of up to 300 GFlops, with typical performance improvements of 2-4× over CUFFT and 8-40× over MKL for large sizes.
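The abstract's remark on Bluestein's algorithm and modular arithmetic can be illustrated with a small CPU-side sketch. This is not the authors' CUDA implementation. It assumes that "modular arithmetic" refers to reducing the chirp exponent j^2 modulo 2N before evaluating the twiddle factor, which keeps the trigonometric argument small. The function names `fft_pow2` and `bluestein_dft` are hypothetical, and a plain recursive radix-2 FFT stands in for the paper's Stockham kernels.

```python
import cmath

def fft_pow2(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft_pow2(a[0::2], inverse)
    odd = fft_pow2(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def bluestein_dft(x):
    """DFT of arbitrary length n via Bluestein's chirp-z convolution."""
    n = len(x)
    # chirp[j] = exp(-i*pi*j^2/n). The factor is periodic in j^2 with
    # period 2n, so j^2 is reduced mod 2n first (assumed interpretation
    # of the abstract's "modular arithmetic").
    chirp = [cmath.exp(-1j * cmath.pi * ((j * j) % (2 * n)) / n)
             for j in range(n)]
    # Pad to a power of two m >= 2n - 1 so the circular convolution
    # matches the linear one on indices 0..n-1.
    m = 1
    while m < 2 * n - 1:
        m *= 2
    a = [x[j] * chirp[j] for j in range(n)] + [0j] * (m - n)
    b = [0j] * m
    b[0] = chirp[0].conjugate()
    for j in range(1, n):
        b[j] = b[m - j] = chirp[j].conjugate()
    fa = fft_pow2(a)
    fb = fft_pow2(b)
    conv = fft_pow2([p * q for p, q in zip(fa, fb)], inverse=True)
    # Divide by m to normalize the inverse FFT, then undo the chirp.
    return [chirp[k] * conv[k] / m for k in range(n)]
```

Calling `bluestein_dft([1, 2, 3, 4, 5])` evaluates a length-5 transform using power-of-two FFTs of length 16. In Python the integer square is exact either way; the reduction matters in fixed-width floating point, where a large j^2 would lose low-order bits before the sine and cosine are evaluated.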
Keywords :
application program interfaces; computer graphic equipment; discrete Fourier transforms; mathematics computing; parallel algorithms; shared memory systems; Bluestein algorithm; CUFFT; Intel MKL; NVIDIA CUDA API; NVIDIA CUFFT library; NVIDIA GPU; Stockham formulation; discrete Fourier transform; graphics processor; hierarchical mixed-radix block-based multiFFT algorithm; high-end quadcore CPU; high-performance computing; memory transpose overhead; modular arithmetic; optimized CPU-implementation; shared memory system; small prime number; Arithmetic; Books; Central Processing Unit; Discrete Fourier transforms; Flexible printed circuits; Graphics; Hardware; High performance computing; Libraries; Signal processing algorithms;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High Performance Computing, Networking, Storage and Analysis, 2008. SC 2008. International Conference for
Conference_Location :
Austin, TX
Print_ISBN :
978-1-4244-2834-2
Electronic_ISBN :
978-1-4244-2835-9
Type :
conf
DOI :
10.1109/SC.2008.5213922
Filename :
5213922