DocumentCode
2255107
Title
Spectral estimation and normalisation for robust speech recognition
Author
Claes, Tom ; Xie, Fei ; Van Compernolle, Dirk
Author_Institution
ESAT, Katholieke Univ., Leuven, Belgium
Volume
4
fYear
1996
fDate
3-6 Oct 1996
Firstpage
1997
Abstract
Speech recognition in adverse conditions remains a difficult problem. It has already been shown that normalisation of the dynamic range (SNR) of the frequency channels in a mel scale triangular filter bank (MFCC) improves the robustness against both additive and convolutional noise. Nevertheless, because the method is based on a masking technique, the improvement is small when the SNR is lower than the target (normalised) SNR. A solution to this problem is to enhance the filter bank energies before the masking technique is applied. For this purpose the authors developed a non-linear spectral estimator (NSE) for speech recognition that operates on the log filter bank energies. NSE enhances these filter bank energies and makes SNR normalisation effective even at very low SNRs. Experimental results are given on the NOISEX-92 database. Better recognition performance is seen even at 0 dB SNR.
Keywords
acoustic filters; estimation theory; spectral analysis; speech recognition; NOISEX-92 database; additive noise; convolutional noise; filter bank energies; log filter bank energies; masking technique; mel scale triangular filter bank; nonlinear spectral estimator; normalisation; recognition performance; robust speech recognition; spectral estimation; Additive noise; Convolution; Dynamic range; Filter bank; Neutron spin echo; Noise figure; Robustness; Signal to noise ratio; Speech recognition; Working environment noise;
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
0-7803-3555-4
Type
conf
DOI
10.1109/ICSLP.1996.607189
Filename
607189