DocumentCode :
682154
Title :
A hybrid GPU/CPU FFT library for large FFT problems
Author :
Shuo Chen ; Xiaoming Li
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Delaware, Newark, DE, USA
fYear :
2013
fDate :
6-8 Dec. 2013
Firstpage :
1
Lastpage :
10
Abstract :
Graphics processing units (GPUs) have proven to be a promising platform for accelerating large Fast Fourier Transform (FFT) computations. However, GPU performance is severely restricted by limited memory size and by the low bandwidth of data transfer over the PCI channel. Additionally, current GPU-based FFT implementations use only the GPU to compute and employ the CPU as a mere memory-transfer controller, so the computing power of the CPU is wasted. This paper proposes a hybrid parallel framework that uses both the multi-core CPU and the GPU in heterogeneous systems to compute large-scale 2D and 3D FFTs that exceed GPU memory. This work introduces a flexible partitioning scheme that enables concurrent execution on the CPU and GPU and integrates several FFT decomposition paradigms to tailor computation and communication. Moreover, our library exposes and exploits previously overlooked parallelism in FFT. Optimal load balancing is achieved automatically through effective performance modeling and an empirical tuning process. On average, our large FFT library on GeForce GTX480, Tesla C2070, and Tesla C2075 GPUs is 121% and 145% faster than 4-thread SSE-enabled FFTW and Intel MKL, with maximum speedups of 4.61 and 2.81, respectively.
Keywords :
fast Fourier transforms; graphics processing units; peripheral interfaces; 3D FFT; CPU FFT library; CPU computing power; FFT computation; FFT decomposition paradigms; GPU memory; GPU performance; GeForce GTX480; Intel MKL; PCI channel; Tesla C2070; Tesla C2075; current GPU based FFT implementation; data transfer; empirical tuning process; flexible partitioning; graphic processing units; hybrid GPU; hybrid parallel framework; large FFT problems; large size fast Fourier transform; limited memory size; memory-transfer controller; multicore CPU; optimal load balancing; performance modeling; tailor computation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Performance Computing and Communications Conference (IPCCC), 2013 IEEE 32nd International
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4799-3213-9
Type :
conf
DOI :
10.1109/PCCC.2013.6742796
Filename :
6742796