Title :
Spectral estimation and normalisation for robust speech recognition
Author :
Claes, Tom ; Xie, Fei ; Van Compernolle, Dirk
Author_Institution :
ESAT, Katholieke Univ., Leuven, Belgium
Abstract :
Speech recognition in adverse conditions remains a challenging problem. It has already been shown that normalisation of the dynamic range (SNR) of the frequency channels in a mel scale triangular filter bank (MFCC) improves the robustness against both additive and convolutional noise. Nevertheless, because the method is based on a masking technique, the improvement is small when the SNR is lower than the target (normalised) SNR. This problem can be solved by enhancing the filter bank energies before the masking technique is applied. For this purpose the authors developed a non-linear spectral estimator (NSE) for speech recognition that operates on the log filter bank energies. NSE enhances these filter bank energies, which makes SNR-normalisation effective at very low SNRs as well. Experimental results are given on the NOISEX-92 database. Better recognition performance is seen even at 0 dB SNR.
Keywords :
acoustic filters; estimation theory; spectral analysis; speech recognition; NOISEX-92 database; additive noise; convolutional noise; filter bank energies; log filter bank energies; masking technique; mel scale triangular filter bank; nonlinear spectral estimator; normalisation; recognition performance; robust speech recognition; spectral estimation; Additive noise; Convolution; Dynamic range; Filter bank; Noise figure; Robustness; Signal to noise ratio; Speech recognition; Working environment noise
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607189