Title :
Perceptual speech coding and enhancement using frame-synchronized fast wavelet packet transform algorithms
Author :
Carnero, Benito ; Drygajlo, Andrzej
Author_Institution :
STMicroelectron. NV, Geneva, Switzerland
fDate :
6/1/1999
Abstract :
This paper presents new wideband speech coding and integrated speech coding-enhancement systems based on frame-synchronized fast wavelet packet transform algorithms. It also formulates temporal and spectral psychoacoustic models of masking adapted to wavelet packet analysis. The algorithm of the proposed FFT-like overlapped block orthogonal wavelet packet transform permits us to efficiently approximate the auditory critical band decomposition in the time and frequency domains. This allows us to make use of the temporal and spectral masking properties of the human auditory system to decrease the average bit rate of the encoder while perceptually hiding the quantization error. The same wavelet packet representation is used to merge speech enhancement and coding in the context of auditory modeling. The advantage of the method presented in this paper over previous approaches is that perceptual enhancement and coding, which are usually implemented as a cascade of two separate systems, are combined. This leads to a decreased computational load. Experiments show that the proposed wideband coding procedure by itself can achieve transparent coding of speech signals sampled at 16 kHz at an average bit rate of 39.4 kbit/s. The combined speech coding-enhancement procedure achieves higher bit rate values that depend on the residual noise characteristics at the output of the enhancement process.
Keywords :
discrete wavelet transforms; fast Fourier transforms; hearing; noise; quantisation (signal); signal representation; signal sampling; speech coding; speech enhancement; speech intelligibility; synchronisation; time-frequency analysis; 16 kHz; 39.4 kbit/s; FFT; auditory critical band decomposition; auditory modeling; average bit rate; computational load reduction; discrete wavelet packet transform; encoder; experiments; fast wavelet packet transform algorithms; frame-synchronized algorithms; frequency domain; human auditory system; integrated speech coding-enhancement systems; overlapped block orthogonal wavelet packet transform; perceptual speech coding; perceptual speech enhancement; quantization error; residual noise characteristics; spectral masking; spectral psychoacoustic models; speech signals sampling; temporal masking; temporal psychoacoustic models; time domain; transparent coding; wavelet packet analysis; wavelet packet representation; wideband speech coding; Bit rate; Frequency domain analysis; Psychoacoustic models; Speech coding; Speech enhancement; Wavelet analysis; Wavelet domain; Wavelet packets; Wavelet transforms; Wideband;
Journal_Title :
Signal Processing, IEEE Transactions on