DocumentCode
590752
Title
Dimensional emotion driven facial expression synthesis based on the multi-stream DBN model
Author
Hao Wu ; Dongmei Jiang ; Yong Zhao ; Sahli, Hichem
Author_Institution
VUB-NPU Joint Res. Group on AVSP, Northwestern Polytech. Univ., Xi'an, China
fYear
2012
fDate
3-6 Dec. 2012
Firstpage
1
Lastpage
6
Abstract
This paper proposes a dynamic Bayesian network (DBN) based, MPEG-4 compliant 3D facial animation synthesis method driven by the (Evaluation, Activation) values of the continuous emotion space. For each emotion, a state-synchronous DBN model (SS_DBN) is first trained on the Cohn-Kanade (CK) database with two input streams: (i) the annotated (Evaluation, Activation) values, and (ii) the Facial Animation Parameters (FAPs) extracted from the face image sequences. Given an input (Evaluation, Activation) sequence, the optimal FAP sequence is then estimated under the maximum likelihood estimation (MLE) criterion and used to construct the MPEG-4 compliant 3D facial animation. Compared with state-of-the-art approaches, where the mapping between the emotion space and the FAPs is set empirically, in our approach the mapping is learned and optimized with the DBN to fit the input (Evaluation, Activation) sequence. Emotion recognition results on the constructed facial animations, as well as subjective evaluations, show that the proposed method produces natural facial animations that represent well the dynamic progression of emotions from neutral to exaggerated.
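The core idea of the abstract, learning (rather than hand-crafting) a mapping from (Evaluation, Activation) values to FAP vectors and then using the maximum-likelihood estimate to drive synthesis, can be illustrated with a much-simplified stand-in. The sketch below replaces the paper's SS_DBN with a per-emotion linear-Gaussian model whose ML fit reduces to least squares; all function names, dimensions, and data here are hypothetical and only mimic the train-then-synthesize pipeline, not the actual DBN inference.

```python
import numpy as np

def fit_ml_mapping(ea, faps):
    """Hypothetical stand-in for DBN training: fit W minimizing
    ||[ea, 1] W - faps||^2, which is the ML solution under Gaussian noise.
    ea: (T, 2) annotated (Evaluation, Activation) values; faps: (T, n_fap)."""
    X = np.hstack([ea, np.ones((ea.shape[0], 1))])  # append a bias column
    W, *_ = np.linalg.lstsq(X, faps, rcond=None)
    return W

def synthesize_faps(ea_seq, W):
    """Synthesis step: map an input (Evaluation, Activation) sequence
    to the corresponding FAP sequence under the learned mapping."""
    X = np.hstack([ea_seq, np.ones((ea_seq.shape[0], 1))])
    return X @ W

# Toy usage: 50 frames, 4 FAP dimensions generated from a known mapping.
rng = np.random.default_rng(0)
ea = rng.uniform(-1.0, 1.0, size=(50, 2))
true_W = rng.normal(size=(3, 4))
faps = np.hstack([ea, np.ones((50, 1))]) @ true_W  # noiseless training data
W = fit_ml_mapping(ea, faps)
est = synthesize_faps(ea, W)
print(np.allclose(est, faps, atol=1e-6))
```

In the paper the mapping is instead a two-stream DBN trained per emotion, so the estimated FAP sequence also captures temporal dynamics; the linear fit above only conveys the "learn the mapping by maximum likelihood, then decode the optimal FAP sequence" structure.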
Keywords
belief networks; computer animation; emotion recognition; face recognition; feature extraction; maximum likelihood estimation; visual databases; CK database; Cohn-Kanade database; DBN-based MPEG-4 compliant 3D facial animation synthesis method; MLE criterion; SS-DBN; annotated values; continuous emotion space; dimensional emotion driven facial expression synthesis; dynamic Bayesian network-based MPEG-4 compliant 3D facial animation synthesis method; emotion recognition; emotional space; emotions dynamic process; face image sequences; facial action parameters extraction; maximum likelihood estimation criterion; multistream DBN model; optimal FAP sequence; state synchronous DBN model; state-of-the-art approaches; Face; Facial animation; Hidden Markov models; Image sequences; Maximum likelihood estimation; Speech; Transform coding;
fLanguage
English
Publisher
ieee
Conference_Titel
Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific
Conference_Location
Hollywood, CA
Print_ISBN
978-1-4673-4863-8
Type
conf
Filename
6411899
Link To Document