3D realistic talking face co-driven by text and speech

Author

Song, Mingli ; Chen, Chun ; Bu, Jiajun ; Liang, Ronghua

Author_Institution

Coll. of Comput. Sci., Zhejiang Univ., Hangzhou, China

Volume

3

fYear

2003

fDate

5-8 Oct. 2003

Firstpage

2175

Abstract

Creating a realistic 3D talking face has long been a challenge. Previous work has driven the talking face with either text or speech alone, and the resulting animation is not very realistic or natural-looking. In the proposed approach, text and speech jointly drive the 3D talking face. The text is translated into a sequence of viseme transcriptions, and a time vector for the sequence is extracted from the speech corresponding to the text after the speech is segmented into a phonetic sequence. A muscle-based viseme vector is defined for each static viseme. Then, from the time vector and the static viseme sequence, dynamic visemes are generated through a time-related dominance function. Finally, according to the frame rate to be rendered, intermediate frames are interpolated between key frames so that the animation looks more natural and realistic than results driven by text or speech alone.
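
The last two steps of the abstract (dominance-weighted dynamic visemes, then interpolation at the frame rate) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the form of the dominance function, so a negative-exponential one is assumed here, and the viseme table, the two-component "muscle" vectors, the rate constant, and all function names are hypothetical.

```python
import math

# Hypothetical static viseme vectors; each component stands for one
# muscle contraction level in [0, 1]. Values are made up for illustration.
STATIC_VISEMES = {
    "sil": [0.0, 0.0],
    "a": [0.9, 0.1],
    "m": [0.0, 0.8],
}


def dominance(t, center, rate=20.0):
    """Assumed negative-exponential dominance, peaking at the viseme's center time."""
    return math.exp(-rate * abs(t - center))


def dynamic_viseme(t, visemes, centers):
    """Blend the static viseme vectors, weighted by their dominance at time t."""
    weights = [dominance(t, c) for c in centers]
    total = sum(weights)
    dims = len(STATIC_VISEMES[visemes[0]])
    return [
        sum(w * STATIC_VISEMES[v][d] for w, v in zip(weights, visemes)) / total
        for d in range(dims)
    ]


def interpolate_frames(key_times, key_frames, fps=25):
    """Linearly interpolate between key frames at the given frame rate.

    Expects at least two key frames with strictly increasing times.
    """
    n = int(round((key_times[-1] - key_times[0]) * fps)) + 1
    frames = []
    k = 0  # index of the key-frame segment that contains the current time
    for i in range(n):
        t = key_times[0] + i / fps
        while k < len(key_times) - 2 and t > key_times[k + 1]:
            k += 1
        t0, t1 = key_times[k], key_times[k + 1]
        a = min(max((t - t0) / (t1 - t0), 0.0), 1.0)
        frames.append(
            [(1 - a) * x + a * y for x, y in zip(key_frames[k], key_frames[k + 1])]
        )
    return frames


# The time vector would come from the segmented speech; these values are made up.
visemes = ["sil", "a", "m"]
centers = [0.0, 0.2, 0.4]
key_frames = [dynamic_viseme(c, visemes, centers) for c in centers]
frames = interpolate_frames(centers, key_frames, fps=25)
```

Here the key frames are the dynamic visemes evaluated at the viseme center times, so neighbouring visemes already influence each key frame before the in-between frames are filled in.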

Keywords

computer animation; image morphing; image sequences; speech processing; speech synthesis; 3D realistic talking face; animation result; dynamic visemes; frame rate; intermediate frames; muscle based viseme vector; phonetic sequence; speech driven talking face; static visemes sequence; text driven talking face; time vector; time-related dominance function; visemes transcription; Cities and towns; Computer science; Educational institutions; Facial animation; Muscles; Performance analysis; Speech processing; Speech synthesis; Text processing; Virtual reality;

fLanguage

English

Publisher

IEEE

Conference_Titel

Systems, Man and Cybernetics, 2003. IEEE International Conference on

ISSN

1062-922X

Print_ISBN

0-7803-7952-7

Type

conf

DOI

10.1109/ICSMC.2003.1244206

Filename

1244206