  • DocumentCode
    2196495
  • Title
    Audio-Visual Speech Synthesis Based on Chinese Visual Triphone
  • Author
    Zhao, Hui ; Chen, Yue-Bing ; Shen, Ya-Min ; Tang, Chao-Jing
  • Author_Institution
    Coll. of Electron. Sci. & Eng., Nat. Univ. of Defense Technol., Changsha, China
  • fYear
    2009
  • fDate
    17-19 Oct. 2009
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    A new audio-visual speech synthesis approach based on the Chinese visual triphone is proposed. The Chinese visual triphone model is constructed using a new clustering method that combines an artificial immune system with fuzzy C-means (FCM) clustering. In the analysis stage, guided by the training phonetic transcription, visual triphone segments are selected from the video sequence and the corresponding lip feature vectors are extracted. In the synthesis stage, a Viterbi search selects the best visual triphone segments by finding the path with minimum cost. Following the concatenation principles, the mouth animation is generated and stitched into the background video. Experimental results show that the synthesized video is natural-looking and satisfactory.
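    The Viterbi unit selection mentioned in the abstract can be sketched as a standard minimum-cost path search over candidate segments. This is a minimal illustrative sketch only; the cost functions, candidate structure, and function names below are assumptions for demonstration, not the paper's actual implementation.

    ```python
    def viterbi_select(candidates, target_cost, concat_cost):
        """Pick one candidate segment per triphone slot so that the total
        target + concatenation cost along the path is minimal.

        candidates:  list of lists; candidates[t] holds segments for slot t
        target_cost: f(t, seg)        -> cost of using seg at slot t (assumed form)
        concat_cost: f(prev_seg, seg) -> cost of joining two segments (assumed form)
        """
        # best[t][j] = (cumulative cost of the cheapest path ending in
        #               candidates[t][j], index of its best predecessor)
        best = [[(target_cost(0, s), -1) for s in candidates[0]]]
        for t in range(1, len(candidates)):
            row = []
            for seg in candidates[t]:
                # cheapest way to reach this segment from any predecessor
                cost, prev = min(
                    (best[t - 1][i][0] + concat_cost(p, seg), i)
                    for i, p in enumerate(candidates[t - 1])
                )
                row.append((cost + target_cost(t, seg), prev))
            best.append(row)
        # backtrack from the cheapest final state
        j = min(range(len(best[-1])), key=lambda i: best[-1][i][0])
        path = []
        for t in range(len(candidates) - 1, -1, -1):
            path.append(candidates[t][j])
            j = best[t][j][1]
        return path[::-1]
    ```

    For example, with numeric stand-ins for segments, `target_cost = lambda t, s: abs(s - desired[t])` and `concat_cost = lambda a, b: abs(a - b)` select the smooth low-cost sequence.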
  • Keywords
    artificial immune systems; audio-visual systems; computer animation; speech synthesis; video signal processing; Chinese visual triphone; FCM; artificial immune system; audio-visual speech synthesis; mouth animation; phonetic transcription; synthesized video; video sequence; Viterbi search algorithm; Animation; Artificial immune systems; Clustering algorithms; Clustering methods; Costs; Feature extraction; Mouth; Speech synthesis; Video sequences; Viterbi algorithm;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image and Signal Processing, 2009. CISP '09. 2nd International Congress on
  • Conference_Location
    Tianjin, China
  • Print_ISBN
    978-1-4244-4129-7
  • Electronic_ISBN
    978-1-4244-4131-0
  • Type
    conf
  • DOI
    10.1109/CISP.2009.5305612
  • Filename
    5305612