Title :
Acoustic driven viseme identification for face animation
Author :
Zhong, Jialin ; Chou, Wu ; Petajan, Eric
Author_Institution :
Lucent Technol., AT&T Bell Labs., Murray Hill, NJ, USA
Abstract :
Unlike other image templates, visemes have identities in two different media. In the audio domain, they are often related to basic linguistic units such as phonemes. In the image domain, they are defined by images of the human articulators, such as mouth shapes and chin movements. This paper presents an approach for extracting visemes from both the image and acoustic domains. In the image domain, mouth shapes, represented by feature points on the inner lip contours, are extracted through face tracking and mouth image analysis. In the acoustic domain, viseme segments are obtained automatically by aligning phoneme strings to audio signals through a Viterbi alignment process.
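As an illustration of the acoustic-domain step, the following is a minimal sketch of Viterbi alignment of a known phoneme string to audio frames. It is not the authors' implementation. It assumes a log-likelihood score for every frame under every phoneme of the string is already available (in practice such scores would come from an acoustic model, for example an HMM), and that each phoneme covers at least one frame. The function names, the toy scores, and the phoneme-to-viseme table are hypothetical.

```python
import math


def viterbi_align(log_likes):
    """Align T frames to a left-to-right string of K phonemes.

    log_likes[t][k] is the (assumed, precomputed) log-likelihood of frame t
    under the k-th phoneme of the string. Returns one phoneme index per
    frame; the path starts at 0, ends at K-1, and never moves backwards.
    """
    T, K = len(log_likes), len(log_likes[0])
    if T < K:
        raise ValueError("need at least one frame per phoneme")
    neg_inf = -math.inf
    score = [[neg_inf] * K for _ in range(T)]
    back = [[0] * K for _ in range(T)]
    score[0][0] = log_likes[0][0]
    for t in range(1, T):
        for k in range(K):
            stay = score[t - 1][k]
            move = score[t - 1][k - 1] if k > 0 else neg_inf
            best, prev = (move, k - 1) if move > stay else (stay, k)
            if best > neg_inf:
                score[t][k] = best + log_likes[t][k]
                back[t][k] = prev
    # Trace back from the last phoneme at the last frame.
    path = [K - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def collapse_segments(path, phonemes):
    """Turn a per-frame path into (phoneme, start_frame, end_frame) runs.

    end_frame is exclusive.
    """
    segs, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segs.append((phonemes[path[start]], start, t))
            start = t
    return segs


def to_viseme_segments(phone_segs, phone_to_viseme):
    """Map phoneme segments to visemes, merging adjacent equal visemes."""
    out = []
    for phone, start, end in phone_segs:
        vis = phone_to_viseme[phone]
        if out and out[-1][0] == vis:
            out[-1] = (vis, out[-1][1], end)
        else:
            out.append((vis, start, end))
    return out


if __name__ == "__main__":
    # Toy scores: frames 0-1 favour the first phoneme, frames 2-4 the second.
    toy = [[-1, -5], [-1, -5], [-5, -1], [-5, -1], [-5, -1]]
    path = viterbi_align(toy)
    phone_segs = collapse_segments(path, ["m", "a"])
    # Hypothetical phoneme-to-viseme table, for illustration only.
    print(to_viseme_segments(phone_segs, {"m": "bilabial", "a": "open"}))
```

Because several phonemes can share one mouth shape, the last step merges neighbouring phoneme segments that map to the same viseme, so the viseme boundaries are a subset of the phoneme boundaries found by the alignment.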
Keywords :
Viterbi detection; computer animation; feature extraction; image matching; speech processing; speech synthesis; Viterbi alignment process; acoustic driven viseme identification; audio signals; basic linguistic units; chin movements; face animation; human articulators; inner lip contours; mouth shapes; phoneme strings; phonemes; Face; Facial animation; Hidden Markov models; Humans; Image segmentation; Image sequences; Mouth; Shape; Speech synthesis; Viterbi algorithm
Conference_Title :
IEEE First Workshop on Multimedia Signal Processing, 1997
Conference_Location :
Princeton, NJ
Print_ISBN :
0-7803-3780-8
DOI :
10.1109/MMSP.1997.602605