Title :
A binaural model as a front-end for isolated word recognition
Author :
Usagawa, Tsuyoshi ; Bodden, Markus ; Rateitschek, Klaus
Author_Institution :
Kumamoto Univ., Japan
Abstract :
Small-vocabulary isolated-word speech recognition can be implemented on relatively small hardware. Although the recognition problem is more or less solved in noise-free situations, general application is hindered by the dramatic drop in performance in noisy environments, especially in hands-free applications. A binaural front end for speech recognition is presented. This binaural model, originally developed at Ruhr University of Bochum in Germany, effectively reduces interfering noise of any kind: besides stationary noise, concurrent speech signals can also be suppressed. The original model was designed as a precise computer model of the human binaural auditory system and can explain a variety of psychoacoustical phenomena. In addition, the model offers sharp directional selectivity, superior to that obtained with directional microphones. We simplified this sophisticated model by adapting it to the specific task, and we use the peak position and the peak level of the binaural activity pattern in each frequency band as parameters for pattern matching. Performance was evaluated as recognition rates in a variety of noisy environments. The results show that the binaural front end significantly improves recognition rates, corresponding to an SNR enhancement of over 20 dB in most cases.
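The abstract names the features (peak position and peak level of the binaural activity pattern per frequency band) but not how they are computed. The following is only a minimal sketch of that idea, under loud assumptions: a plain normalized interaural cross-correlation stands in for the binaural model, the input is assumed to be already split into frequency bands, and the names `cross_correlation`, `peak_features`, `band_features` and `max_lag` are hypothetical and do not come from the paper.

```python
import math


def cross_correlation(left, right, max_lag):
    """Return {lag: normalized correlation} for lags in [-max_lag, max_lag].

    Assumption: this simple correlation is a stand-in for the binaural
    activity pattern; it is not the model described in the paper.
    A positive lag means the right signal is a delayed copy of the left.
    """
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    n = len(left)
    pattern = {}
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        pattern[lag] = acc / norm if norm > 0 else 0.0
    return pattern


def peak_features(left, right, max_lag):
    """Return (peak position in samples, peak level) for one band."""
    pattern = cross_correlation(left, right, max_lag)
    peak_lag = max(pattern, key=pattern.get)
    return peak_lag, pattern[peak_lag]


def band_features(bands, max_lag):
    """Apply peak_features to each (left, right) pair of band signals."""
    return [peak_features(left, right, max_lag) for left, right in bands]
```

In such a sketch, `band_features` would be called once per analysis frame with one (left, right) pair of band-limited signals per frequency band, and the resulting list of (position, level) pairs would form the feature vector handed to the pattern matcher.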
Keywords :
pattern matching; speech enhancement; speech processing; speech recognition; 20 dB; binaural activity pattern; binaural front end; binaural model; concurrent speech signals; frequency band; hands free applications; human binaural auditory system; interfering noises; noisy environments; peak position; psycho acoustical phenomena; recognition rates; sharp directional selectivity; small vocabulary isolated word speech recognition; stationary noises; Auditory system; Frequency; Hardware; Humans; Microphones; Noise reduction; Psychology; Speech recognition; Vocabulary; Working environment noise
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607280