• DocumentCode
    794845
  • Title
    Real-time speech-driven face animation with expressions using neural networks
  • Author
    Hong, Pengyu; Wen, Zhen; Huang, Thomas S.
  • Author_Institution
    Beckman Inst. for Adv. Sci. & Technol., Univ. of Illinois, Urbana, IL, USA
  • Volume
    13
  • Issue
    4
  • fYear
    2002
  • fDate
    7/1/2002
  • Firstpage
    916
  • Lastpage
    927
  • Abstract
    A real-time speech-driven synthetic talking face provides an effective multimodal communication interface in distributed collaboration environments. Nonverbal gestures such as facial expressions are important to human communication and should be considered by speech-driven face animation systems. In this paper, we present a framework that systematically addresses facial deformation modeling, automatic facial motion analysis, and real-time speech-driven face animation with expressions using neural networks. Based on this framework, we learn a quantitative visual representation of the facial deformations, called the motion units (MUs). A facial deformation can be approximated by a linear combination of the MUs weighted by MU parameters (MUPs). We develop an MU-based facial motion tracking algorithm, which is used to collect an audio-visual training database. We then construct a real-time audio-to-MUP mapping by training a set of neural networks on the collected audio-visual training database. The quantitative evaluation of the mapping shows the effectiveness of the proposed approach. Using the proposed method, we develop the functionality of real-time speech-driven face animation with expressions for the iFACE system. Experimental results show that the synthetic expressive talking face of the iFACE system is comparable with a real face in terms of its influence on bimodal human emotion perception.
  • Keywords
    computer animation; face recognition; gesture recognition; neural nets; real-time systems; speech recognition; audio-visual training database; automatic facial motion analysis; bimodal human emotion perception; distributed collaboration environments; facial deformation modeling; facial expressions; iFACE system; motion units; multimodal communication interface; neural networks; nonverbal gestures; quantitative evaluation; quantitative visual representation; real-time audio-to-MUP mapping; real-time speech-driven face animation; real-time speech-driven synthetic talking face; audio databases; collaboration; deformable models; face; facial animation; humans; linear approximation; motion analysis; real-time systems
  • fLanguage
    English
  • Journal_Title
    IEEE Transactions on Neural Networks
  • Publisher
    IEEE
  • ISSN
    1045-9227
  • Type
    jour
  • DOI
    10.1109/TNN.2002.1021892
  • Filename
    1021892