Title :
Auditory pathway model and its VLSI implementation for robust speech recognition in real-world noisy environment
Author :
Lee, Soo-Young ; Kim, Chang-Min ; Won, Young-Gul ; Park, Hyung-Min
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Korea Adv. Inst. of Sci. & Technol., South Korea
Abstract :
A robust speech recognition system is reported, based on mathematical models of the auditory pathway and their VLSI implementations. The developed auditory model consists of three components: nonlinear feature extraction at the cochlea, binaural processing at the superior olivary complex, and top-down attention through the backward path. The feature extraction is based on a cochlear filter bank and time-frequency masking, which is modeled with lateral inhibition in both the time and frequency domains. Unlike popular binaural processing models based on a simple interaural time delay and interaural intensity difference, our model incorporates hundreds of time delays to handle noisy reverberated signals. Top-down (TD) attention arises from the familiarity and/or importance of the sound, and a simple but efficient TD attention model has been developed based on the error backpropagation algorithm. These auditory models are computationally intensive, so dedicated hardware has been developed for real-time applications. Experimental results demonstrate much better recognition performance in real-world noisy environments.
Keywords :
VLSI; backpropagation; delays; feature extraction; speech recognition; auditory pathway model; binaural processing; cochlear filter bank; error backpropagation algorithm; frequency domain; interaural intensity difference; interaural time delay; mathematical models; noisy reverberated signals; nonlinear feature extraction; real world noisy environment; robust speech recognition; time domain; time frequency masking; top down attention; very large scale integration; Acoustic noise; Feature extraction; Filter bank; Frequency domain analysis; Mathematical model; Robustness; Speech recognition; Time frequency analysis; Very large scale integration; Working environment noise
Conference_Title :
Neural Networks and Signal Processing, 2003. Proceedings of the 2003 International Conference on
Conference_Location :
Nanjing
Print_ISBN :
0-7803-7702-8
DOI :
10.1109/ICNNSP.2003.1281219