Title :
Warped linear predictive speech coding
Author :
Wong, Chian-Hong ; Lim, Heng-Siong ; Tan, Alan Wee-Chiat
Author_Institution :
Fac. of Eng. & Technol., Multimedia Univ., Ayer Keroh, Malaysia
Abstract :
This work aims to enhance human speech energy at low frequencies using linear prediction methods, in order to improve the performance of speech recognizers in noisy conditions. To this end, a recognition system based on Warped Linear Prediction (WLP) is proposed. WLP is based on the Warped Fourier Transform: the signal is warped onto another frequency scale and the Fourier transform is computed on the warped scale. This transformation increases the frequency resolution in the low-frequency region, so that more detailed information about the signal can be obtained there. After the signal has been warped, cepstral coefficients are computed; this procedure is referred to as Warped Linear Prediction. The method was evaluated in isolated-word recognition tests. Experimental results show that WLP outperforms conventional linear prediction over the tested SNR range for both distortion measures that were tried. The new method shows no degradation in recognition accuracy at high SNR and performs significantly better at low SNR. At an SNR of 4 dB, performance improvements of up to 70 percent are observed.
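The pipeline described in the abstract (warp the frequency scale, fit a linear predictor, derive cepstral coefficients) can be illustrated with a minimal sketch. It uses the common all-pass formulation of warped linear prediction, in which each unit delay of the autocorrelation is replaced by a first-order all-pass section; this is an assumption and not necessarily the authors' Warped Fourier Transform implementation. The function names, the warping factor `lam`, the model order and the number of cepstral coefficients are all hypothetical choices made for illustration.

```python
import math

def allpass(x, lam):
    """First-order all-pass D(z) = (z^-1 - lam) / (1 - lam z^-1)."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = -lam * xn + x_prev + lam * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y

def warped_autocorr(x, order, lam):
    """r[k] = sum_n x[n] * (D^k x)[n]; with lam = 0 this is ordinary autocorrelation."""
    r, y = [], list(x)
    for _ in range(order + 1):
        r.append(sum(a * b for a, b in zip(x, y)))
        y = allpass(y, lam)
    return r

def levinson(r, order):
    """Levinson-Durbin recursion; returns A(z) coefficients (a[0] = 1) and error energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of the all-pole model 1 / A(z)."""
    p = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        an = a[n] if n <= p else 0.0
        acc = sum(m * c[m] * a[n - m] for m in range(1, n) if n - m <= p)
        c[n] = -an - acc / n
    return c[1:]

def wlp_cepstrum(frame, order=8, n_ceps=12, lam=0.5):
    """Warped-LP cepstral features for one frame (lam = 0.5 is an assumed value)."""
    r = warped_autocorr(frame, order, lam)
    a, _ = levinson(r, order)
    return lpc_to_cepstrum(a, n_ceps)

# Usage on a synthetic frame (a sinusoid plus a small deterministic dither).
frame = [math.sin(0.3 * n) + 0.1 * ((n * 7) % 5 - 2) for n in range(200)]
features = wlp_cepstrum(frame)
```

In this formulation a positive `lam` allocates more resolution to low frequencies, which matches the motivation given in the abstract, and setting `lam = 0` reduces the sketch to conventional linear prediction, so the two front ends can be compared under the same distortion measure.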
Keywords :
Fourier transforms; cepstral analysis; linear codes; signal resolution; speech coding; speech enhancement; speech recognition; WLP; cepstral coefficients; frequency region; frequency resolution; frequency scale; human speech energy enhancement; linear prediction methods; noisy conditions; recognition accuracy; recognition system; speech signal transformation; warped Fourier transform; warped linear prediction; warped linear predictive speech coding; warped scale; warping; word recognition tests; Correlation; Distortion measurement; Signal to noise ratio; Speech
Conference_Title :
Signal and Image Processing Applications (ICSIPA), 2011 IEEE International Conference on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-4577-0243-3
DOI :
10.1109/ICSIPA.2011.6144160