DocumentCode
34682
Title
The Fastest Fourier Transform in the South
Author
Blake, Anthony M. ; Witten, I.H. ; Cree, Michael J.
Author_Institution
Dept. of Comput. Sci., Univ. of Waikato, Hamilton, New Zealand
Volume
61
Issue
19
fYear
2013
fDate
Oct.1, 2013
Firstpage
4707
Lastpage
4716
Abstract
This paper describes FFTS, a discrete Fourier transform (DFT) library that achieves state-of-the-art performance using a new cache-oblivious algorithm implemented with run-time specialization. During initialization the transform parameters, such as sign and direction, are fixed, allowing sequencing and data access information to be precomputed and specialized machine code to be generated. The resulting code may be executed any number of times on different input data. In contrast to FFTW, FFTS does not use large base cases at the leaves of recursion, nor a substantial library of codelets, and does not require time-consuming machine-specific calibration. The code presented in this paper has been benchmarked on recent Intel x86 and ARM machines, and is, in almost all cases, faster than self-tuning libraries such as FFTW, and even vendor-tuned libraries such as Intel IPP and Apple vDSP.
Keywords
cache storage; discrete Fourier transforms; program compilers; software libraries; ARM machines; Apple vDSP; DFT library; FFTS; FFTW; Intel IPP; Intel x86; cache-oblivious algorithm; codelets; data access information; discrete Fourier transform; dynamic code generation; fastest Fourier transform; run-time specialization; self-tuning libraries; specialized machine code; transform parameters; vendor-tuned libraries; Discrete Fourier transforms; Libraries; Prediction algorithms; Runtime; Signal processing algorithms; Dynamic code generation; FFT; Fourier transform; run-time specialization;
fLanguage
English
Journal_Title
Signal Processing, IEEE Transactions on
Publisher
ieee
ISSN
1053-587X
Type
jour
DOI
10.1109/TSP.2013.2273199
Filename
6557535
Link To Document