DocumentCode :
1155898
Title :
Emphatic Visual Speech Synthesis
Author :
Melenchón, Javier ; Martínez, Elisa ; De la Torre, Fernando ; Montero, José A.
Author_Institution :
Estudis d'Informàtica, Multimèdia i Telecomunicació, Univ. Oberta de Catalunya, Barcelona
Volume :
17
Issue :
3
fYear :
2009
fDate :
3/1/2009 12:00:00 AM
Firstpage :
459
Lastpage :
468
Abstract :
The synthesis of talking heads has been a flourishing research area over the last few years. Since human beings have an uncanny ability to read people's faces, most related applications (e.g., advertising, video-teleconferencing) require absolutely realistic photometric and behavioral synthesis of faces. This paper proposes a person-specific facial synthesis framework that allows high realism and includes a novel way to control visual emphasis (e.g., level of exaggeration of visible articulatory movements of the vocal tract). There are three main contributions: a geodesic interpolation with visual unit selection, a parameterization of visual emphasis, and the design of minimum size corpora. Perceptual tests with human subjects reveal high realism properties, achieving similar perceptual scores as real samples. Furthermore, the visual emphasis level and two communication styles show a statistical interaction relationship.
Keywords :
face recognition; interpolation; speech synthesis; emphatic visual speech synthesis; geodesic interpolation; person-specific facial behavioral synthesis framework; photometry; statistical interaction; Advertising; Buildings; Control system synthesis; Face; Humans; Interpolation; Photometry; Speech synthesis; Testing; Transducers; Audiovisual speech synthesis; emphatic visual-speech; talking head;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2008.2010213
Filename :
4782040