Title :
Baum-Welch hidden Markov model inversion for reliable audio-to-visual conversion
Author :
Choi, KyoungHo ; Hwang, Jenq-Neng
Author_Institution :
Dept. of Electr. Eng., Washington Univ., Seattle, WA, USA
Abstract :
In this paper, a novel audio-to-visual conversion method is presented. Many multimedia applications, such as videophones, videoconferencing, man-machine interfaces, language dubbing, and character animation in virtual reality, require techniques for synchronizing audio and video in a synthesized talking head sequence. For these applications, it is necessary to reliably estimate accurate mouth (visual) movements from the corresponding speech (audio) data. The hidden Markov model inversion (HMMI) technique, introduced for robust speech recognition, is extended in this paper into the audio-visual feature space. Based on the Baum-Welch HMMI method, reliable visual parameters are extracted given speech data only. Our preliminary simulation results show that the visual parameters estimated by the proposed method match the true visual parameters smoothly as well as accurately. The proposed estimation technique can be combined with video coding and graphics techniques for other multimedia applications.
Keywords :
audio signal processing; hidden Markov models; image sequences; multimedia systems; speech recognition; Baum-Welch hidden Markov model inversion; accurate mouth movements; audio/video synchronisation; character animation; graphics techniques; language dubbing; man-machine interface; multimedia applications; reliable audio-to-visual conversion; reliable visual parameter extraction; speech data; synthesised talking head sequence; video coding; videoconferencing; videophone; virtual reality; Animation; Data mining; Hidden Markov models; Mouth; Robustness; Speech recognition; Speech synthesis; Teleconferencing; User interfaces; Virtual reality;
Conference_Title :
Multimedia Signal Processing, 1999 IEEE 3rd Workshop on
Conference_Location :
Copenhagen
Print_ISBN :
0-7803-5610-1
DOI :
10.1109/MMSP.1999.793816