Regional Information Center for Science and Technology - Bimodal sensor integration on the example of 'speechreading'

DocumentCode :

1904506

Title :

Bimodal sensor integration on the example of 'speechreading'

Author :

Bregler, Christoph ; Manke, Stefan ; Hild, Hermann ; Waibel, Alex

Author_Institution :

Dept. of Comput. Sci., Karlsruhe Univ., Germany

fYear :

1993

fDate :

1993

Firstpage :

667

Abstract :

This paper shows how recognition performance in automated speech perception can be significantly improved by adding lipreading, so-called speech-reading. The approach is demonstrated as an extension of an existing state-of-the-art speech recognition system, a modular multi-state time-delay neural network (MS-TDNN). The acoustic and visual speech data are preclassified in two separate front-end phoneme TDNNs and combined into acoustic-visual hypotheses for the dynamic time warping algorithm. The method is evaluated on a connected word recognition problem, the letter spelling task. With speech-reading, the error rate can be reduced to as little as half that of purely acoustic recognition.

Keywords :

neural nets; speech recognition; acoustic-visual hypotheses; automated speech perception; bimodal sensor; lipreading; modular multi-state time delay neural net; speech-reading; Acoustic devices; Auditory system; Cameras; Computer architecture; Computer science; Error analysis; Humans; Microphones; Signal processing

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Neural Networks, 1993., IEEE International Conference on

Conference_Location :

San Francisco, CA

Print_ISBN :

0-7803-0999-5

Type :

conf

DOI :

10.1109/ICNN.1993.298634

Filename :

298634

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1904506