• DocumentCode
    3335573
  • Title
    Expressive Visual Text-to-Speech Using Active Appearance Models
  • Author
    Anderson, Richard ; Stenger, Bjorn ; Wan, Vincent ; Cipolla, Roberto
  • Author_Institution
    Dept. of Eng., Univ. of Cambridge, Cambridge, UK
  • fYear
    2013
  • fDate
    23-28 June 2013
  • Firstpage
    3382
  • Lastpage
    3389
  • Abstract
    This paper presents a complete system for expressive visual text-to-speech (VTTS), which is capable of producing expressive output, in the form of a 'talking head', given an input text and a set of continuous expression weights. The face is modeled using an active appearance model (AAM), and several extensions are proposed which make it more applicable to the task of VTTS. The model allows for normalization with respect to both pose and blink state, which significantly reduces artifacts in the resulting synthesized sequences. We demonstrate quantitative improvements in terms of reconstruction error over a million frames, as well as in large-scale user studies, comparing the output of different systems.
  • Keywords
    pose estimation; speech processing; text analysis; AAM; VTTS; active appearance models; blink state; expressive visual text-to-speech; pose state; Active appearance model; Face; Hidden Markov models; Shape; Speech; Three-dimensional displays; Training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on
  • Conference_Location
    Portland, OR
  • ISSN
    1063-6919
  • Type
    conf
  • DOI
    10.1109/CVPR.2013.434
  • Filename
    6619278