Author :
Li, Xiaojun ; Gao, Yang ; Liu, Ying
Author_Institution :
Sch. of Inf. Sci. & Eng., Grad. Univ. of Chinese Acad. of Sci., Beijing, China
Abstract :
Heterogeneous platforms that integrate SMPs, clusters, GPUs, FPGAs, etc. are becoming the most popular supercomputer architectures. Achieving high performance on CPUs or GPUs requires careful consideration of their different architectures, which challenges the skills of programmers. To overcome the portability problem, the Khronos Compute Working Group proposed OpenCL, a free cross-platform programming standard. However, the performance of OpenCL-based programs has not yet been thoroughly studied. In this paper, we therefore first design OpenFFT-Bench, a benchmark FFT application that combines an OpenCL-based FFT with OpenGL-based real-time spectrum visualization. We evaluate its performance on four OpenCL programming platforms: NVIDIA CUDA, ATI Stream (GPU), ATI Stream (CPU), and Intel OpenCL. The characteristics of OpenFFT-Bench are investigated with multiple FFT sizes. Experimental results show that applications based on OpenCL and OpenGL not only run on multiple heterogeneous platforms but also achieve relatively high performance on GPU-based platforms.
Keywords :
coprocessors; fast Fourier transforms; parallel architectures; parallel machines; ATI Stream; CPU; FFT application; GPU-based platform; Intel OpenCL; NVIDIA CUDA; OpenCL programming; OpenCL-based FFT; OpenFFT-bench; architecture; fast Fourier transform; heterogeneous platform; portability problem; real-time spectrum visualization; supercomputer; Benchmark testing; Computer architecture; Computers; Discrete Fourier transforms; Graphics processing unit; Parallel processing; CUDA; OpenCL; OpenFFT-Bench; heterogeneous platforms