DocumentCode
62163
Title
Exploiting Psychological Factors for Interaction Style Recognition in Spoken Conversation
Author
Wen-Li Wei ; Chung-Hsien Wu ; Jen-Chun Lin ; Han Li
Author_Institution
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
Volume
22
Issue
3
fYear
2014
fDate
Mar-14
Firstpage
659
Lastpage
671
Abstract
Determining how a speaker is engaged in a conversation is crucial for achieving harmonious interaction between computers and humans. In this study, a fusion approach was developed based on psychological factors to recognize Interaction Style (IS) in spoken conversation, which plays a key role in creating natural dialogue agents. The proposed Fused Cross-Correlation Model (FCCM) provides a unified probabilistic framework to model the relationships among the psychological factors of emotion, personality trait (PT), transient IS, and IS history, for recognizing IS. An emotional arousal-dependent speech recognizer was used to obtain the recognized spoken text for extracting linguistic features to estimate transient IS likelihood and recognize PT. A temporal course modeling approach and an emotional sub-state language model, based on the temporal phases of an emotional expression, were employed to obtain a better emotion recognition result. The experimental results indicate that the proposed FCCM yields satisfactory results in IS recognition and also demonstrate that combining psychological factors effectively improves IS recognition accuracy.
Keywords
correlation methods; emotion recognition; feature extraction; probability; speaker recognition; FCCM; emotional arousal-dependent speech recognizer; emotional sub-state language model; fused cross-correlation model; fusion approach; harmonious interaction; interaction style recognition; linguistic feature extraction; natural dialogue agents; psychological factors; spoken conversation; temporal course modeling approach; temporal phases; transient IS likelihood; unified probabilistic framework; Accuracy; Emotion recognition; Feature extraction; Psychology; Speech; Speech recognition; Text recognition; Emotion; interaction style; language model; personality trait; temporal phase
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
2329-9290
Type
jour
DOI
10.1109/TASLP.2014.2300339
Filename
6714384