• DocumentCode
    32010
  • Title
    Time-Frequency Feature and AMS-GMM Mask for Acoustic Emotion Classification
  • Author
    Zao, L. ; Cavalcante, D. ; Coelho, Rui
  • Author_Institution
    Grad. Program in Defense Eng., Mil. Inst. of Eng. (IME), Rio de Janeiro, Brazil
  • Volume
    21
  • Issue
    5
  • fYear
    2014
  • fDate
    May-14
  • Firstpage
    620
  • Lastpage
    624
  • Abstract
    In this letter, the pH time-frequency vocal source feature is proposed for multistyle emotion identification. A binary acoustic mask is also used to improve the emotion classification accuracy. Emotional and stress conditions from the Berlin Database of Emotional Speech (EMO-DB) and the Speech under Simulated and Actual Stress (SUSAS) database are investigated in the experiments. In terms of emotion identification rates, the pH feature outperforms the mel-frequency cepstral coefficients (MFCC) and a Teager-Energy-Operator (TEO) based feature. Moreover, the acoustic mask improves accuracy for both the MFCC and the pH feature.
  • Keywords
    Gaussian processes; amplitude modulation; emotion recognition; signal classification; time-frequency analysis; AMS-GMM mask; Berlin database of emotional speech database; EMO-DB database; Gaussian mixture models; MFCC; SUSAS database; acoustic emotion classification; amplitude modulation spectrogram; binary acoustic mask; mel-frequency cepstral coefficients; multistyle emotion identification; pH time-frequency vocal source feature; speech under simulated and actual stress databases; teager energy operator; Acoustics; Databases; Discrete wavelet transforms; Feature extraction; Speech; Time-frequency analysis; Vectors; Binary acoustic mask; Hurst exponent; pH feature; speech emotion recognition;
  • fLanguage
    English
  • Journal_Title
    Signal Processing Letters, IEEE
  • Publisher
    IEEE
  • ISSN
    1070-9908
  • Type
    jour
  • DOI
    10.1109/LSP.2014.2311435
  • Filename
    6766238