Title :
On the use of wideband signal for noise robust ASR
Author :
D.A. Macho; Yan Ming Cheng
Author_Institution :
Human Interface Lab., Motorola Labs., Schaumburg, IL, USA
fDate :
2003
Abstract :
Wideband audio signals will be commonly used in near-future telecommunication applications. In quiet environments, speech recognition performance improves when a wideband signal is used instead of a narrowband signal. For practical ASR systems, however, the question is whether a wideband signal also helps when recognizing noisy speech. Compared with narrowband speech, wideband speech contains additional high-frequency components, which usually have low intensity and are therefore vulnerable to noise distortion. If these added high-frequency components are distorted, the wideband feature set may be less robust than the narrowband feature set. We investigate whether adding information from high frequencies to the ASR feature set can improve recognition performance on noisy speech. The differences between the low- and high-frequency parts of the wideband speech spectrum suggest processing the two parts separately. We propose an algorithm in which this separate processing scheme permits reuse of a noise-robust front-end originally designed for a narrowband signal. A low-complexity processing stage is designed for the high-frequency components, which usually carry less information. The high-frequency information is added to the narrowband speech features in the form of de-noised filter-bank energies. These energies are appended either before or after the cepstral features are computed. In the best case, we obtained a 13.96% average relative improvement when recognizing wideband noisy speech, compared with the performance on narrowband noisy speech. The proposed algorithm is part of the recently adopted ETSI standard for the advanced front-end for distributed speech recognition.
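The idea of appending de-noised high-band filter-bank energies to existing narrowband cepstral features can be illustrated with a minimal sketch. This is not the authors' algorithm or the ETSI front-end: the function names, the rectangular band layout, the 4 kHz split, the number of bands, and the use of plain spectral subtraction as the de-noising step are all assumptions made for illustration. It shows only the variant in which the energies are appended after the cepstral features have been computed.

```python
import numpy as np


def highband_log_energies(power_spec, sample_rate, n_bands=3,
                          f_lo=4000.0, noise_est=None, floor=1e-10):
    """Log energies of a few bands above f_lo (hypothetical helper).

    power_spec: array of shape (frames, n_bins) covering 0..sample_rate/2.
    noise_est:  optional per-bin noise power estimate of shape (n_bins,).
    """
    n_bins = power_spec.shape[1]
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    # Assumption: equal-width rectangular bands between f_lo and Nyquist.
    edges = np.linspace(f_lo, sample_rate / 2.0, n_bands + 1)

    if noise_est is None:
        noise_est = np.zeros(n_bins)
    # Assumption: simple spectral subtraction stands in for de-noising.
    cleaned = np.maximum(power_spec - noise_est, floor)

    energies = np.empty((power_spec.shape[0], n_bands))
    for b in range(n_bands):
        lo, hi = edges[b], edges[b + 1]
        if b == n_bands - 1:
            mask = (freqs >= lo) & (freqs <= hi)  # include the Nyquist bin
        else:
            mask = (freqs >= lo) & (freqs < hi)
        energies[:, b] = cleaned[:, mask].sum(axis=1)
    return np.log(np.maximum(energies, floor))


def append_highband(narrowband_cepstra, hb_log_energies):
    """Concatenate high-band energies to narrowband features, frame by frame."""
    return np.concatenate([narrowband_cepstra, hb_log_energies], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spec = rng.random((4, 129)) + 1.0   # synthetic power spectra, 16 kHz audio
    cepstra = rng.random((4, 13))       # placeholder narrowband cepstra
    hb = highband_log_energies(spec, sample_rate=16000)
    print(append_highband(cepstra, hb).shape)
```

Keeping the high-band path separate means the narrowband front-end is left untouched, which matches the reuse argument made in the abstract; only the final feature vector grows by the number of high-frequency bands.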
Keywords :
"Wideband","Noise robustness","Automatic speech recognition","Narrowband","Speech recognition","Frequency","Working environment noise","Speech enhancement","Distortion","Speech processing"
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
Print_ISBN :
0-7803-7663-3
DOI :
10.1109/ICASSP.2003.1202306