DocumentCode :
2892079
Title :
Bayesian networks in multimodal speech recognition and speaker identification
Author :
Nefian, Ara V. ; Liang, Lu Hong
Volume :
2
fYear :
2003
fDate :
9-12 Nov. 2003
Firstpage :
2004
Abstract :
Bayesian networks are statistical models that extend the framework of hidden Markov models (HMMs) and allow for the analysis of multimodal signals such as audio-visual speech. Our recent results demonstrate the use of coupled HMMs in audio-visual speech recognition and speaker identification. The increased performance of this model is due to its low complexity and its ability to describe both the audio-visual state asynchrony and the natural dependency over time. Audio-visual speaker identification accuracy is further enhanced by a late decision approach that integrates the audio-visual speech likelihood with the face likelihood computed using an embedded Bayesian network.
Keywords :
audio-visual systems; belief networks; face recognition; hidden Markov models; speaker recognition; speech processing; Bayesian network; audio-visual speech likelihood computation; audio-visual speech recognition; audio-visual state asynchrony; coupled HMM; face likelihood computation; hidden Markov model; multimodal signal; multimodal speech recognition; speaker identification; statistical model; Audio-visual systems; Bayesian methods; Computer networks; Hidden Markov models; Intelligent networks; Signal analysis; Speech analysis; Speech enhancement; Speech recognition; Spine;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signals, Systems and Computers, 2003. Conference Record of the Thirty-Seventh Asilomar Conference on
Print_ISBN :
0-7803-8104-1
Type :
conf
DOI :
10.1109/ACSSC.2003.1292332
Filename :
1292332