Title :
Trainable videorealistic speech animation
Author :
Ezzat, Tony ; Geiger, Gadi ; Poggio, Tomaso
Author_Institution :
Center for Biol. & Comput. Learning, Massachusetts Inst. of Technol., Cambridge, MA, USA
Abstract :
We describe how to create a generative, videorealistic speech animation module using machine learning techniques. A human subject is first recorded with a video camera as he or she utters a predetermined speech corpus. After the corpus is processed automatically, a visual speech module is learned from the data; it is capable of synthesizing the human subject's mouth uttering entirely novel utterances that were not recorded in the original video. The synthesized utterance is re-composited onto a background sequence that contains natural head and eye movement. The final output is videorealistic in the sense that it looks like a video camera recording of the subject. At run time, the input to the system can be either real audio sequences or synthetic audio produced by a text-to-speech system, as long as it has been phonetically aligned.
Keywords :
computer animation; face recognition; image sequences; learning (artificial intelligence); speech synthesis; video cameras; machine learning techniques; synthesized utterance; text-to-speech system; video camera; videorealistic speech animation; visual speech module; Animation; Audio recording; Cameras; Humans; Machine learning; Magnetic heads; Mouth; Speech processing; Speech synthesis; Video recording;
Conference_Title :
Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings.
Print_ISBN :
0-7695-2122-3
DOI :
10.1109/AFGR.2004.1301509