Title :
Mel-scaled discrete wavelet coefficients for speech recognition
Author :
Gowdy, J.N. ; Tufekci, Z.
Author_Institution :
Dept. of Electr. & Comput. Eng., Clemson Univ., SC, USA
Abstract :
In this paper we propose a new feature vector consisting of mel-frequency discrete wavelet coefficients (MFDWC). The MFDWC are obtained by applying the discrete wavelet transform (DWT) to the mel-scaled log filterbank energies of a speech frame. The purpose of using the DWT is to benefit from its localization property in the time and frequency domains. MFDWC are similar to subband-based (SUB) features and multi-resolution (MULT) features in that all three attempt to achieve good time and frequency localization. However, MFDWC have better time/frequency localization than SUB features and MULT features. We evaluated the performance of the new features on clean and noisy speech and compared MFDWC with mel-frequency cepstral coefficients (MFCC), SUB features, and MULT features. Experimental results on a phoneme recognition task showed that an MFDWC-based recognizer gave better results than recognizers based on MFCC, SUB features, and MULT features for the white Gaussian noise, band-limited white Gaussian noise, and clean speech cases.
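The feature computation described in the abstract (a DWT applied to the mel-scaled log filterbank energies of one speech frame) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the wavelet family, the decomposition depth, or the number of filterbank channels, so the Haar wavelet, three levels, and eight channels used here are placeholders, and the function names `haar_dwt` and `mfdwc_sketch` are hypothetical.

```python
import math


def haar_dwt(values, levels):
    """Multi-level orthonormal Haar DWT.

    Assumption: Haar is used only because it is the simplest wavelet;
    the abstract does not say which wavelet the authors used.
    Returns approximation coefficients followed by detail coefficients,
    ordered from the coarsest level to the finest.
    """
    approx = list(values)
    details = []
    for _ in range(levels):
        if len(approx) % 2:
            raise ValueError("length must be divisible by 2**levels")
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        # Later (coarser) levels go in front of earlier (finer) ones.
        details.insert(0, detail)
    coeffs = approx
    for d in details:
        coeffs = coeffs + d
    return coeffs


def mfdwc_sketch(filterbank_energies, levels=3, floor=1e-10):
    """Take the log of mel filterbank energies for one frame, then apply the DWT.

    `floor` guards against log(0); it is an illustrative choice.
    """
    log_energies = [math.log(max(e, floor)) for e in filterbank_energies]
    return haar_dwt(log_energies, levels)


# Hypothetical mel filterbank energies for one frame (8 channels, made-up values).
frame_energies = [1.0, 2.0, 4.0, 8.0, 8.0, 4.0, 2.0, 1.0]
features = mfdwc_sketch(frame_energies, levels=3)
```

Each detail coefficient depends only on a few neighbouring mel channels, so a disturbance confined to one band changes only some of the coefficients. An MFCC computed with the DCT, by contrast, mixes every channel into every coefficient.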
Keywords :
Gaussian noise; channel bank filters; discrete wavelet transforms; speech recognition; time-frequency analysis; Mel-scaled discrete wavelet coefficients; band-limited white gaussian noise; clean speech; discrete wavelet transform; feature vector; frequency domain; mel-frequency cepstral coefficients; mel-frequency discrete wavelet coefficients; mel-scaled log filterbank energies; noisy speech; performance; phoneme recognition task; speech frame; speech recognition; time domain; time/frequency localization; white gaussian noise; Cepstral analysis; Discrete wavelet transforms; Filter bank; Frequency domain analysis; Gaussian noise; Mel frequency cepstral coefficient; Speech analysis; Speech enhancement; Speech recognition; Wavelet coefficients;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.861829