• DocumentCode
    3703411
  • Title

    An experimental study of speech emotion recognition based on deep convolutional neural networks

  • Author

    W. Q. Zheng;J. S. Yu;Y. X. Zou

  • Author_Institution
    ADSPLAB/ELIP, School of Electronic and Computer Engineering, Peking University, Shenzhen, China
  • fYear
    2015
  • Firstpage
    827
  • Lastpage
    831
  • Abstract
    Speech emotion recognition (SER) is a challenging task because it is unclear which features reflect the characteristics of human emotion in speech, and traditional feature extraction methods perform inconsistently across emotion recognition tasks. Spectrograms of speech conveying different emotions carry different information. This paper proposes a systematic approach to implementing an effective emotion recognition system based on deep convolutional neural networks (DCNNs) using labeled training audio data. Specifically, the log-spectrogram is computed and principal component analysis (PCA) is used to reduce the dimensionality and suppress interference. The PCA-whitened spectrogram is then split into non-overlapping segments. The DCNN is constructed to learn a representation of emotion from the segments using labeled training speech data. Our preliminary experiments show that the proposed DCNN-based emotion recognition system (containing 2 convolution and 2 pooling layers) achieves about 40% classification accuracy. It also outperforms SVM-based classification using hand-crafted acoustic features.
  • Keywords
    "Speech","Speech recognition","Emotion recognition","Spectrogram","Feature extraction","Principal component analysis","Convolution"
  • Publisher
    ieee
  • Conference_Titel
    Affective Computing and Intelligent Interaction (ACII), 2015 International Conference on
  • Electronic_ISBN
    2156-8111
  • Type

    conf

  • DOI
    10.1109/ACII.2015.7344669
  • Filename
    7344669
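
The preprocessing steps named in the abstract (log-spectrogram, PCA whitening, non-overlapping segmentation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the frame length, hop size, number of principal components, segment length, and the random test signal are all assumptions chosen for illustration, since the record gives none of these values. The DCNN stage is not shown.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128, eps=1e-10):
    """Log power spectrogram, shape (num_frames, frame_len // 2 + 1).
    frame_len and hop are assumed values, not taken from the paper."""
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(num_frames)])
    return np.log(np.abs(np.fft.rfft(frames, axis=1)) ** 2 + eps)

def pca_whiten(features, num_components, eps=1e-8):
    """Project frames onto the top principal components and rescale
    each component to (approximately) unit variance."""
    centered = features - features.mean(axis=0)
    cov = centered.T @ centered / (len(centered) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:num_components]
    return centered @ eigvecs[:, order] / np.sqrt(eigvals[order] + eps)

def split_segments(features, seg_len):
    """Cut the frame sequence into non-overlapping segments of seg_len
    frames, dropping any trailing frames that do not fill a segment."""
    num_segments = len(features) // seg_len
    return features[: num_segments * seg_len].reshape(num_segments, seg_len, -1)

# Stand-in input: one second of noise at an assumed 16 kHz sampling rate.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

spec = log_spectrogram(signal)                  # (124, 129)
whitened = pca_whiten(spec, num_components=40)  # (124, 40)
segments = split_segments(whitened, seg_len=20) # (6, 20, 40)
```

Each entry of `segments` would then serve as one labeled input to the network. In practice the PCA basis would be estimated on the training set and reused for test utterances, rather than fitted per utterance as in this sketch.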