DocumentCode :
1091279
Title :
Speech-driven facial animation using a hierarchical model
Author :
Cosker, D.P. ; Marshall, A.D. ; Rosin, P.L. ; Hicks, Y.A.
Author_Institution :
School of Computer Science, Cardiff University, UK
Volume :
151
Issue :
4
fYear :
2004
Firstpage :
314
Lastpage :
321
Abstract :
A system capable of producing near video-realistic animation of a speaker, given only speech input, is presented. The audio input is a continuous speech signal that requires no phonetic labelling and is speaker-independent. The system requires only a short video training corpus of a subject speaking a list of viseme-targeted words to achieve convincing, realistic facial synthesis. It learns the natural mouth and face dynamics of a speaker, allowing new facial poses, unseen in the training video, to be synthesised. To achieve this, the authors developed a novel approach that uses a hierarchical, nonlinear principal component analysis (PCA) model coupling speech and appearance. Animation of the different facial areas defined by the hierarchy is performed separately and merged in post-processing using an algorithm that combines texture and shape PCA data. The model is shown to be capable of synthesising videos of a speaker from new audio segments spoken by both previously heard and unheard speakers.
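As a rough illustration of the joint audio-appearance coupling described in the abstract, the sketch below fits a single linear PCA basis over concatenated per-frame audio and appearance vectors, then estimates appearance parameters from audio alone. This is only a minimal, assumption-laden stand-in: the paper's model is hierarchical and nonlinear, and the feature extraction, function names, and toy data here are hypothetical, not the authors' implementation.

```python
# Minimal sketch (not the paper's method): couple audio and appearance
# features with a single linear PCA over concatenated training vectors,
# then predict the appearance half from audio-only input.
import numpy as np

def fit_joint_pca(audio_feats, appearance_params, n_components=10):
    """Fit PCA on concatenated [audio | appearance] training frames."""
    joint = np.hstack([audio_feats, appearance_params])  # (frames, d_a + d_p)
    mean = joint.mean(axis=0)
    # SVD of the centred data gives the principal axes; keep the leading ones.
    _, _, vt = np.linalg.svd(joint - mean, full_matrices=False)
    return mean, vt[:n_components]                        # basis: (k, d_a + d_p)

def audio_to_appearance(audio_vec, mean, basis, d_audio):
    """Project a new audio frame onto the audio sub-basis, then read the
    appearance parameters back from the appearance half of the joint basis."""
    a_mean, p_mean = mean[:d_audio], mean[d_audio:]
    a_basis, p_basis = basis[:, :d_audio], basis[:, d_audio:]
    # Least-squares fit of joint-model coefficients using only the audio part.
    coeffs, *_ = np.linalg.lstsq(a_basis.T, audio_vec - a_mean, rcond=None)
    return p_mean + coeffs @ p_basis

# Toy usage with random stand-in data; real inputs would be audio features
# and tracked face-model parameters extracted from the training video.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 13))     # 200 frames of 13-D audio features
P = A @ rng.normal(size=(13, 20))  # 20-D appearance params correlated with audio
mean, basis = fit_joint_pca(A, P, n_components=10)
pred = audio_to_appearance(A[0], mean, basis, d_audio=13)
```

In the paper this coupling is applied per facial region within a hierarchy, with nonlinear (cluster-based) PCA rather than the single linear basis used above, and the per-region outputs are merged in post-processing.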
Keywords :
audio signal processing; computer animation; principal component analysis; speech processing; video signal processing; cluster modelling; hierarchical facial model; principal component analysis model; speech signal; speech-driven facial animation;
fLanguage :
English
Journal_Title :
IEE Proceedings - Vision, Image and Signal Processing
Publisher :
IET
ISSN :
1350-245X
Type :
jour
DOI :
10.1049/ip-vis:20040752
Filename :
1331219