Title :
Radix-4 FFT implementation using SIMD multimedia instructions
Author :
Nadehara, Kouhei ; Miyazaki, Takashi ; Kuroda, Ichiro
Author_Institution :
C&C Media Res. Labs., NEC Corp., Kawasaki, Japan
Abstract :
A fast radix-4 complex FFT implementation using 4-parallel SIMD instructions is presented. Four radix-4 butterflies are calculated in parallel at all stages by loading 4 consecutive elements into a register. At the last stage, every 4 elements are packed into a register and calculated in parallel. This regular data flow enables higher parallelism and reduces the overhead of data format conversion. Implemented on the V830R processor, which has a 4-parallel SIMD-type multimedia instruction set, it achieves practical performance quite competitive with high-end parallel DSPs. Multiply-accumulate instructions with symmetrical rounding, introduced in the V830R processor, are effective in maintaining FFT accuracy.
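The data flow described in the abstract can be illustrated with a minimal 16-point sketch. It is not the authors' V830R code: Python lists of 4 complex values stand in for 4-lane SIMD registers, and the names `butterfly4_lanes` and `fft16` are hypothetical. The transpose before the last stage is one plausible reading of "every 4 elements are packed into a register", assumed here for illustration only.

```python
import cmath

def butterfly4_lanes(a, b, c, d):
    """Apply one radix-4 butterfly per lane to four 4-lane 'registers'.

    Lane i combines a[i], b[i], c[i], d[i] as a 4-point DFT, so four
    butterflies are computed side by side, as a SIMD unit would.
    """
    out0, out1, out2, out3 = [], [], [], []
    for x0, x1, x2, x3 in zip(a, b, c, d):
        s02, d02 = x0 + x2, x0 - x2
        s13, d13 = x1 + x3, x1 - x3
        out0.append(s02 + s13)        # X0 = x0 + x1 + x2 + x3
        out1.append(d02 - 1j * d13)   # X1 = (x0 - x2) - j(x1 - x3)
        out2.append(s02 - s13)        # X2 = x0 - x1 + x2 - x3
        out3.append(d02 + 1j * d13)   # X3 = (x0 - x2) + j(x1 - x3)
    return out0, out1, out2, out3

def fft16(x):
    """16-point radix-4 decimation-in-frequency FFT built from two stages."""
    assert len(x) == 16
    # Stage 1: each register holds 4 consecutive elements, so lane n
    # combines x[n], x[n+4], x[n+8], x[n+12].
    z = butterfly4_lanes(x[0:4], x[4:8], x[8:12], x[12:16])
    # Twiddle factors W16^(n*q) for output q of the butterfly in lane n.
    z = [[z[q][n] * cmath.exp(-2j * cmath.pi * n * q / 16) for n in range(4)]
         for q in range(4)]
    # Last stage: the 4 inputs of each butterfly now sit inside a single
    # register, so repack (transpose) to put one butterfly in each lane.
    # This repacking step is an assumption made for the sketch.
    t = [[z[q][n] for q in range(4)] for n in range(4)]
    y = butterfly4_lanes(t[0], t[1], t[2], t[3])  # y[r][q] holds X[4r + q]
    return [y[k // 4][k % 4] for k in range(16)]
```

Calling `fft16` on a unit impulse (`[1] + [0] * 15`) returns 16 values equal to 1, as expected for a DFT. With the repacking used here, the result comes out in natural order.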
Keywords :
digital arithmetic; digital signal processing chips; fast Fourier transforms; instruction sets; multimedia systems; parallel architectures; parallel programming; reduced instruction set computing; FFT accuracy; SIMD multimedia instructions; SIMD-type multimedia instruction set; V830R processor; data format conversion; fast radix-4 complex FFT implementation; multiply-accumulate instructions; overhead reduction; parallel DSP; parallel SIMD instructions; performance; processor architecture; radix-4 butterflies; register; regular data flow; symmetrical rounding; Digital signal processing; Discrete Fourier transforms; Hardware; Microprocessors; Parallel processing; Registers; Signal processing; Signal processing algorithms; Streaming media; Video signal processing;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
Conference_Location :
Phoenix, AZ
Print_ISBN :
0-7803-5041-3
DOI :
10.1109/ICASSP.1999.758355