DocumentCode :
3246914
Title :
Neural network lipreading system for improved speech recognition
Author :
Stork, David G. ; Wolff, Greg ; Levine, Earl
Author_Institution :
Ricoh California Res. Center, Menlo Park, CA, USA
Volume :
2
fYear :
1992
fDate :
7-11 Jun 1992
Firstpage :
289
Abstract :
A modified time-delay neural network (TDNN) has been designed to perform automatic lipreading (speechreading) in conjunction with acoustic speech recognition, in order to improve recognition both in silent environments and in the presence of acoustic noise. The system is far more robust to acoustic noise and verbal distractors than a system that does not incorporate visual information. Specifically, in the presence of high-amplitude pink noise, the low recognition rate of the acoustic-only system (43%) is raised to 75% by the incorporation of visual information. The system responds to (artificial) conflicting cross-modal patterns in a way closely analogous to the McGurk effect in humans. The power of neural techniques is demonstrated in several difficult domains: pattern recognition, sensory integration, and distributed approaches toward 'rule-based' (linguistic-phonological) processing.
Keywords :
acoustic noise; delays; neural nets; pattern recognition; speech recognition; McGurk effect; conflicting cross-modal patterns; distributed approaches; high-amplitude pink noise; linguistic-phonological processing; neural network lipreading system; rule-based processing; sensory integration; time-delay neural network; 1/f noise; automatic speech recognition; humans; neural networks; noise robustness; speech enhancement; working environment noise
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Neural Networks, 1992. IJCNN., International Joint Conference on
Conference_Location :
Baltimore, MD
Print_ISBN :
0-7803-0559-0
Type :
conf
DOI :
10.1109/IJCNN.1992.226994
Filename :
226994