Regional Information Center for Science and Technology - State Synchronous Modeling on Phone Boundary for Audio Visual Speech Recognition and Application to Muti-View Face Images

DocumentCode :

2702029

Title :

State Synchronous Modeling on Phone Boundary for Audio Visual Speech Recognition and Application to Muti-View Face Images

Author :

Kumatani, Kenichi ; Stiefelhagen, Rainer

Author_Institution :

Interactive Syst. Labs., Karlsruhe Univ., Germany

Volume :

fYear :

2007

fDate :

15-20 April 2007

Abstract :

Visual speech cues are known to improve the performance of automatic speech recognition (ASR). However, most researchers have relied mainly on the speaker's frontal pose. We therefore introduce a new database for large vocabulary audio visual automatic speech recognition (AV-ASR), which contains not only frontal face images but also face images taken from different angles (multi-view face images). Another contribution of this paper is a new algorithm that models audio and visual characteristics between phones. Finally, we conducted large vocabulary continuous speech recognition experiments on the new database using the new algorithm. Experimental results show that the proposed AV-ASR system achieved high accuracy even when the views in the training and test data are mismatched.

Keywords :

audio-visual systems; face recognition; speech recognition; audio visual speech recognition; muti-view face images; phone boundary; state synchronous modeling; vocabulary continuous speech recognition; Automatic speech recognition; Face detection; Feature extraction; Hidden Markov models; Image databases; Linear discriminant analysis; Mouth; Speech recognition; Visual databases; Vocabulary; Audio visual automatic speech recognition; multi-view; product HMM; visual information

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on

Conference_Location :

Honolulu, HI

ISSN :

1520-6149

Print_ISBN :

1-4244-0727-3

Type :

conf

DOI :

10.1109/ICASSP.2007.366938

Filename :

4218126

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2702029