• DocumentCode
    3432833
  • Title
    Affective structure modeling of speech using probabilistic context free grammar for emotion recognition
  • Author
    Kun-Yi Huang ; Jia-Kuan Lin ; Yu-Hsien Chiu ; Chung-Hsien Wu
  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    5286
  • Lastpage
    5290
  • Abstract
    A complete emotional expression in natural conversation typically follows a complex temporal course. Related research on utterance-level and segment-level processing lacks understanding of the underlying structure of emotional speech. In this study, a hierarchical affective structure of an emotional utterance, characterized by probabilistic context-free grammars (PCFGs), is proposed for emotion modeling. SVM-based emotion profiles are obtained and employed to segment the utterance into emotionally consistent segments. Vector quantization is applied to convert the emotion profile of each segment into codewords. A binary tree in which each node represents a codeword is constructed to characterize the affective structure of the utterance modeled by the PCFG. Given an input utterance, the output emotion is determined according to the PCFG-based emotion model with the highest likelihood of the speech segments along with the score of the affective structure. For evaluation, experiments were conducted on the EMO-DB database and on an expansion of it in utterance length. Experimental results show that the proposed method achieved an emotion recognition accuracy of 87.22% for long utterances and outperformed the SVM-based method.
  • Keywords
    context-free grammars; emotion recognition; probability; support vector machines; PCFG; binary tree; codewords; complex temporal course; emotion modeling; emotional expression; emotional speech; emotionally consistent segments; output emotion; probabilistic context free grammar; segment level processing; speech affective structure modeling; utterance level processing; vector quantization; Databases; Feature extraction; Hidden Markov models; Probabilistic logic; Speech; Speech recognition; Speech emotion recognition; affective structure model
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178980
  • Filename
    7178980