Title :
Exploring Cross-Modality Affective Reactions for Audiovisual Emotion Recognition
Author :
Mariooryad, S.; Busso, Carlos
Author_Institution :
Multimodal Signal Processing Lab., University of Texas at Dallas, Richardson, TX, USA
Abstract :
Psycholinguistic studies on human communication have shown that during human interaction individuals tend to adapt their behaviors, mimicking the spoken style, gestures, and expressions of their conversational partners. This synchronization pattern is referred to as entrainment. This study investigates the presence of entrainment at the emotion level in cross-modality settings and its implications for multimodal emotion recognition systems. The analysis explores the relationship between the acoustic features of the speaker and the facial expressions of the interlocutor during dyadic interactions. It shows that the conversational partners displayed similar emotions 72 percent of the time, indicating strong mutual influence in their expressive behaviors. We also investigate cross-modality, cross-speaker dependencies using a mutual information framework. The study reveals a strong relationship between the facial and acoustic features of one subject and the emotional state of the other subject, as well as strong dependencies between heterogeneous modalities across conversational partners. These findings suggest that the expressive behaviors of one dialog partner provide complementary information for recognizing the emotional state of the other dialog partner. The analysis motivates classification experiments that exploit cross-modality, cross-speaker information. The study presents emotion recognition experiments using the IEMOCAP and SEMAINE databases. The results demonstrate the benefit of exploiting this emotional entrainment effect, showing statistically significant improvements.
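The abstract does not give implementation details, but the cross-speaker dependence analysis it describes can be approximated with an off-the-shelf mutual information estimator. The sketch below is a minimal illustration under stated assumptions, not the authors' code: the arrays speaker_a_features (turn-level acoustic or facial features of one partner) and speaker_b_emotions (categorical emotion labels of the other partner) are hypothetical placeholders, and scikit-learn's mutual_info_classif is used to score how informative each feature of one partner is about the interlocutor's emotional state.

```python
# Minimal sketch of a cross-speaker mutual information analysis (illustrative only).
# Hypothetical inputs, not data from the paper:
#   speaker_a_features: (n_turns, n_features) features extracted from one partner
#   speaker_b_emotions: (n_turns,) categorical emotion labels of the other partner
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
speaker_a_features = rng.normal(size=(500, 10))    # placeholder feature matrix
speaker_b_emotions = rng.integers(0, 4, size=500)  # placeholder labels (e.g., 4 emotion classes)

# Estimate mutual information between each of partner A's features and
# partner B's emotion label; higher values suggest stronger cross-speaker dependence.
mi = mutual_info_classif(speaker_a_features, speaker_b_emotions, random_state=0)
for idx, score in enumerate(mi):
    print(f"feature {idx}: MI with interlocutor emotion = {score:.4f} nats")
```

In the same spirit, features with high cross-speaker mutual information could be appended to a partner's own feature vector before training an emotion classifier, which is the kind of cross-modality, cross-speaker augmentation the classification experiments in the paper evaluate.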
Keywords :
audio-visual systems; emotion recognition; face recognition; psychology; speech recognition; IEMOCAP databases; SEMAINE databases; audiovisual emotion recognition; conversational partner expressions; conversational partner gestures; conversational partner spoken style; cross-modality affective reactions; cross-speaker information; dialog partner; dyadic interactions; facial features; heterogeneous modalities; human communication; human interaction; interlocutor facial expression; multimodal emotion recognition systems; mutual information framework; psycholinguistic studies; speaker acoustic features; statistically significant improvements; Acoustics; Databases; Emotion recognition; Facial features; Feature extraction; Mutual information; Speech; Entrainment; cross-subject multimodal emotion recognition; emotionally expressive speech; facial expressions; multimodal interaction;
Journal_Title :
Affective Computing, IEEE Transactions on
DOI :
10.1109/T-AFFC.2013.11