DocumentCode
61628
Title
A Practical Model for Live Speech-Driven Lip-Sync
Author
Li Wei; Zhigang Deng
Author_Institution
Dept. of Comput. Sci., Univ. of Houston, Houston, TX, USA
Volume
35
Issue
2
fYear
2015
fDate
Mar.-Apr. 2015
Firstpage
70
Lastpage
78
Abstract
This article introduces a simple, efficient, and practical phoneme-based approach for generating realistic speech animation in real time from live speech input. Specifically, the authors first decompose lower-face movements into low-dimensional principal component spaces. Then, in each retained principal component space, they select the AnimPho with the highest priority value and the minimum smoothness energy. Finally, they apply motion blending and interpolation to compute the final animation frames for the current input phoneme. Through extensive experiments and comparisons, the authors demonstrate both the realism of the speech animation synthesized by their approach and its real-time efficiency on an off-the-shelf computer.
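The abstract outlines a selection-then-blending pipeline: pick the AnimPho with the highest priority value and minimum smoothness energy in each retained PCA space, then blend it with the previous motion. The sketch below is only an illustration of that idea, not the authors' implementation: the dictionary layout, the squared-jump smoothness energy, and the linear cross-fade blending are all assumptions.

```python
import numpy as np

def select_animpho(candidates, prev_end):
    """Pick the AnimPho with the highest priority value; within that
    top-priority group, pick the one with minimum smoothness energy.

    candidates: list of dicts with keys
        'traj'     - (T, k) trajectory in the retained PCA space
        'priority' - scalar priority value
    prev_end: (k,) last frame of the previously played AnimPho.
    (Data layout and energy definition are assumptions for illustration.)
    """
    top = max(c['priority'] for c in candidates)
    return min(
        (c for c in candidates if c['priority'] == top),
        # assumed smoothness energy: squared jump from the previous frame
        key=lambda c: float(np.sum((c['traj'][0] - prev_end) ** 2)),
    )

def blend(prev_traj, next_traj, n_blend=3):
    """Linearly cross-fade the last n_blend frames of prev_traj into the
    first n_blend frames of next_traj (an assumed, simple blending scheme)."""
    w = np.linspace(0.0, 1.0, n_blend)[:, None]
    head = (1 - w) * prev_traj[-n_blend:] + w * next_traj[:n_blend]
    return np.vstack([head, next_traj[n_blend:]])
```

In practice the selected trajectory would be mapped back from PCA coefficients to lower-face vertex positions before rendering; that reconstruction step is omitted here.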
Keywords
computer animation; interpolation; speech processing; AnimPho; animation frames; interpolation techniques; live speech input; live speech-driven lip-sync; low-dimensional principal component spaces; lower-face movements; minimum smoothness energy; motion blending; phoneme-based approach; priority value; realistic speech animation; synthesized speech animation; Animation; Interpolation; Motion segmentation; Real-time systems; Speech processing; Speech recognition; computer graphics; facial animation; live speech driven; speech animation; talking avatars; virtual humans;
fLanguage
English
Journal_Title
IEEE Computer Graphics and Applications
Publisher
IEEE
ISSN
0272-1716
Type
jour
DOI
10.1109/MCG.2014.105
Filename
6894485
Link To Document