Title : 
Realtime speech-driven facial animation using Gaussian Mixture Models
         
        
            Author : 
Changwei Luo ; Yu Jun ; Xian Li ; Zengfu Wang
         
        
            Author_Institution : 
Dept. of Autom., Univ. of Sci. & Technol. of China, Hefei, China
         
        
        
        
        
        
            Abstract : 
Synthesizing speech-driven facial animation is the process of animating a virtual face according to an input audio signal. Audio-to-visual conversion is the core of this process. In this paper, Gaussian Mixture Models (GMMs) are employed for audio-to-visual conversion. The conventional GMM-based method performs the conversion frame by frame using minimum mean square error estimation. We consider two issues with the conventional method: the influence of previous visual features on the current visual feature is not considered, and GMM training and conversion are inconsistent. To address these issues, we propose incorporating previous visual features into the conversion. We also propose a minimum conversion error based approach to refine the GMM parameters. Experiments on a publicly available database show that our method can accurately convert audio features into visual features. The conversion accuracy is comparable to that of a current state-of-the-art trajectory-based approach. Based on the proposed method, we develop a speech-driven facial animation system that runs in real time and outputs realistic speech animations.
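
The conventional frame-by-frame conversion mentioned in the abstract can be illustrated with a minimal sketch. It assumes a joint GMM over stacked audio and visual feature vectors and computes the minimum mean square error estimate as the posterior-weighted sum of per-component conditional means. The function names, the argument layout, and the use of NumPy are assumptions made for illustration; they are not taken from the paper, and the sketch covers neither the proposed use of previous visual features nor the minimum conversion error refinement.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate normal evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def mmse_convert(x, weights, means, covs, dim_a):
    """MMSE estimate of a visual vector given one audio frame x.

    Assumed layout (hypothetical, not from the paper): each joint mean is
    [audio; visual] and each joint covariance is partitioned accordingly,
    with the first dim_a entries belonging to the audio part.
    """
    log_post = []
    cond_means = []
    for w, mu, S in zip(weights, means, covs):
        mu_a, mu_v = mu[:dim_a], mu[dim_a:]
        S_aa = S[:dim_a, :dim_a]   # audio-audio block
        S_va = S[dim_a:, :dim_a]   # visual-audio cross block
        # Unnormalised log posterior of this component given the audio frame.
        log_post.append(np.log(w) + gaussian_logpdf(x, mu_a, S_aa))
        # Conditional mean of the visual part under this component.
        cond_means.append(mu_v + S_va @ np.linalg.solve(S_aa, x - mu_a))
    log_post = np.array(log_post)
    post = np.exp(log_post - log_post.max())  # numerically stable softmax
    post /= post.sum()
    return post @ np.array(cond_means)

def convert_sequence(audio_frames, weights, means, covs, dim_a):
    """Apply the conversion independently to every frame."""
    return np.array([mmse_convert(x, weights, means, covs, dim_a)
                     for x in audio_frames])
```

Because each frame is converted independently in this sketch, nothing ties one output frame to the previous one, which is the first of the two issues the abstract raises about the conventional method.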
         
        
            Keywords : 
Gaussian processes; computer animation; estimation theory; learning (artificial intelligence); mean square error methods; mixture models; speech synthesis; GMM; Gaussian mixture model; audio-to-visual conversion; minimum mean square error estimation; realtime speech-driven facial animation synthesis; trajectory-based approach; virtual face animating process; Facial animation; Principal component analysis; Shape; Training; Vectors; Visualization; speech-driven
         
        
        
        
            Conference_Title : 
Multimedia and Expo Workshops (ICMEW), 2014 IEEE International Conference on
         
        
            Conference_Location : 
Chengdu
         
        
        
        
            DOI : 
10.1109/ICMEW.2014.6890554