• DocumentCode
    3716199
  • Title
    Dynamic speech emotion recognition with state-space models
  • Author
    Konstantin Markov;Tomoko Matsui;Francois Septier;Gareth Peters
  • Author_Institution
    The University of Aizu, Japan
  • fYear
    2015
  • Firstpage
    2077
  • Lastpage
    2081
  • Abstract
    Automatic emotion recognition from speech has focused mainly on identifying categorical or static affect states, but the spectrum of human emotion is continuous and time-varying. In this paper, we present a recognition system for dynamic speech emotion based on state-space models (SSMs). The prediction of the unknown emotion trajectory in the affect space spanned by Arousal, Valence, and Dominance (A-V-D) descriptors is cast as a time series filtering task. The state-space models we investigated include a standard linear model (Kalman filter) as well as a novel non-linear, non-parametric Gaussian process (GP) based SSM. We use the AVEC 2014 database for evaluation; it provides ground-truth A-V-D labels, which allow the state and measurement functions to be learned separately, simplifying model training. For filtering with the GP SSM, we used two approximation methods: a recently proposed analytic method and a particle filter. All models were evaluated in terms of average Pearson correlation R and root mean square error (RMSE). The results show that, using the same feature vectors, the GP SSMs achieve twice the correlation and half the RMSE of a Kalman filter.
  • Keywords
    "Speech","Speech recognition","Emotion recognition","Kalman filters","Approximation methods","State-space methods","Gaussian processes"
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference (EUSIPCO), 2015 23rd European
  • Electronic_ISBN
    2076-1465
  • Type
    conf
  • DOI
    10.1109/EUSIPCO.2015.7362750
  • Filename
    7362750