DocumentCode
2892107
Title
Boosted audio-visual HMM for speech reading
Author
Yin, Pei ; Essa, Irfan ; Rehg, James M.
Author_Institution
GVU Center, Georgia Inst. of Technol., Atlanta, GA, USA
Volume
2
fYear
2003
fDate
9-12 Nov. 2003
Firstpage
2013
Abstract
We propose a new approach for combining acoustic and visual measurements to aid in recognizing the lip shapes of a person speaking. Our method relies on computing the maximum likelihoods of (a) HMMs used to model phonemes from the acoustic signal, and (b) HMMs used to model visual feature motions from video. One significant addition in this work is dynamic analysis with features selected by AdaBoost on the basis of their discriminant ability. This form of integration, leading to a boosted HMM, permits AdaBoost to find the best features first and then uses the HMM to exploit the dynamic information inherent in the signal.
Keywords
audio-visual systems; feature extraction; hidden Markov models; maximum likelihood estimation; speech processing; speech recognition; video signal processing; AdaBoost; acoustic measurement; acoustic signal; boosted audio-visual HMM; dynamic analysis; feature selection; hidden Markov model; lip shape recognition; maximum likelihood; phoneme model; speech reading; video signal; visual feature motion; visual measurement; Acoustic applications; Acoustic measurements; Educational institutions; Face detection; Hidden Markov models; Information analysis; Natural languages; Shape measurement; Signal analysis; Speech recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Signals, Systems and Computers, 2003. Conference Record of the Thirty-Seventh Asilomar Conference on
Print_ISBN
0-7803-8104-1
Type
conf
DOI
10.1109/ACSSC.2003.1292334
Filename
1292334