Title :
An Implementation of Parallel 1-D FFT on the K Computer
Author :
Takahashi, Daisuke ; Uno, Atsuya ; Yokokawa, Mitsuo
Author_Institution :
Fac. of Eng., Inf. & Syst., Univ. of Tsukuba, Tsukuba, Japan
Abstract :
In this paper, we propose an implementation of a parallel one-dimensional fast Fourier transform (FFT) on the K computer. The proposed algorithm is based on the six-step FFT algorithm, which can be altered into the recursive six-step FFT algorithm to reduce the number of cache misses. The recursive six-step FFT algorithm improves performance by utilizing the cache memory effectively. We use the recursive six-step FFT algorithm to implement the parallel one-dimensional FFT algorithm. The performance results of one-dimensional FFTs on the K computer are reported. We successfully achieved a performance of over 18 TFlops on 8192 nodes of the K computer (82944 nodes, 128 GFlops/node, 10.6 PFlops peak performance) for a 241-point FFT.
Keywords :
cache storage; distributed memory systems; fast Fourier transforms; parallel processing; recursive functions; K computer; TFlops; cache memory; parallel one-dimensional FFT algorithm; parallel one-dimensional fast Fourier transform; parallel recursive six-step FFT algorithm; Arrays; Computers; Distributed databases; Equations; Indexes; Kernel; Registers; Fast Fourier transform; all-to-all communication; distributed-memory parallel computer;
Conference_Titel :
High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS), 2012 IEEE 14th International Conference on
Conference_Location :
Liverpool
Print_ISBN :
978-1-4673-2164-8
DOI :
10.1109/HPCC.2012.53