Title :
Performance comparison of SIMD implementations of the discrete wavelet transform
Author :
Shahbahrami, Asadollah ; Juurlink, Ben ; Vassiliadis, Stamatis
Author_Institution :
Fac. of Electr. Eng., Math., & Comput. Sci., Delft Univ. of Technol., Netherlands
Abstract :
This paper focuses on SIMD implementations of the 2D discrete wavelet transform (DWT). The transforms considered are Daubechies' real-to-real method of four coefficients (Daub-4) and the integer-to-integer (5, 3) lifting scheme. Daub-4 is implemented using SSE and the lifting scheme using MMX, and their performance is compared to C implementations on a Pentium 4 processor. The MMX implementation of the lifting scheme is up to 4.0× faster than the corresponding C program for a 1-level 2D DWT, while the SSE implementation of Daub-4 is up to 2.6× faster than the C version. For some image sizes, performance is significantly hampered by the so-called 64K aliasing problem, which occurs on the Pentium 4 when two accessed data blocks are a multiple of 64K apart. It is also shown that for the (5, 3) lifting scheme, a 12-bit word size is sufficient for a 5-level decomposition of the 2D DWT for images of up to 10 bits per pixel.
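For readers unfamiliar with the integer-to-integer (5, 3) lifting scheme named in the abstract, the following is a minimal scalar sketch of one 1D level in C. It is not the authors' code: the function names (`lift53_forward`, `lift53_inverse`, `floor_div`), the symmetric boundary extension, the rounding offsets (which follow the commonly used reversible 5/3 formulation), and the restriction to even-length signals are all assumptions made for illustration. The `int16_t` element type is likewise an assumption, chosen because 16-bit words are a natural fit for packed MMX arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Floor division by a positive divisor b (C's '/' truncates toward zero,
 * which would round negative values the wrong way). */
static int32_t floor_div(int32_t a, int32_t b)
{
    return (a >= 0) ? a / b : -((-a + b - 1) / b);
}

/* Forward 1D (5,3) lifting step (illustrative sketch).
 * x:    n input samples, n even and n >= 2 (assumption)
 * low:  n/2 approximation coefficients (output)
 * high: n/2 detail coefficients (output) */
void lift53_forward(const int16_t *x, size_t n, int16_t *low, int16_t *high)
{
    size_t half = n / 2;

    /* Predict: each odd sample minus the floored mean of its even neighbours.
     * The right neighbour is mirrored at the end of the signal. */
    for (size_t i = 0; i < half; i++) {
        int32_t left  = x[2 * i];
        int32_t right = (2 * i + 2 < n) ? x[2 * i + 2] : x[n - 2];
        high[i] = (int16_t)(x[2 * i + 1] - floor_div(left + right, 2));
    }

    /* Update: each even sample plus a rounded quarter of the adjacent details.
     * The left detail is mirrored at the start of the signal. */
    for (size_t i = 0; i < half; i++) {
        int32_t prev = (i > 0) ? high[i - 1] : high[0];
        low[i] = (int16_t)(x[2 * i] + floor_div(prev + high[i] + 2, 4));
    }
}

/* Inverse 1D (5,3) lifting step: undoes lift53_forward exactly by running
 * the same two steps in reverse order with the signs flipped. */
void lift53_inverse(const int16_t *low, const int16_t *high, size_t n, int16_t *x)
{
    size_t half = n / 2;

    /* Undo update: recover the even samples. */
    for (size_t i = 0; i < half; i++) {
        int32_t prev = (i > 0) ? high[i - 1] : high[0];
        x[2 * i] = (int16_t)(low[i] - floor_div(prev + high[i] + 2, 4));
    }

    /* Undo predict: recover the odd samples from the restored even ones. */
    for (size_t i = 0; i < half; i++) {
        int32_t left  = x[2 * i];
        int32_t right = (2 * i + 2 < n) ? x[2 * i + 2] : x[n - 2];
        x[2 * i + 1] = (int16_t)(high[i] + floor_div(left + right, 2));
    }
}
```

Because both steps use only integer additions and floor divisions by 2 and 4, the transform maps integers to integers and the inverse reconstructs the input exactly. A 1-level 2D DWT would typically apply such a routine to every row and then to every column of the image; a SIMD version would process several 16-bit samples per instruction instead of one.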
Keywords :
digital arithmetic; discrete wavelet transforms; parallel processing; 64K aliasing problem; C program; Daub-4; Daubechies real-to-real method; SIMD implementation; discrete wavelet transform; integer-to-integer lifting scheme; Computer science; Convolution; Discrete wavelet transforms; Filter bank; Finite impulse response filter; Laboratories; Mathematics; Signal processing; Transform coding; Wavelet transforms; Discrete Wavelet Transform; SIMD extensions; lifting scheme
Conference_Title :
Application-Specific Systems, Architecture Processors, 2005. ASAP 2005. 16th IEEE International Conference on
Print_ISBN :
0-7695-2407-9
DOI :
10.1109/ASAP.2005.51