• DocumentCode
    177763
  • Title
    Variability compensation in small data: Oversampled extraction of i-vectors for the classification of depressed speech
  • Author
    Cummins, Nicholas ; Epps, Julien ; Sethu, Vidhyasaharan ; Krajewski, Jarek
  • Author_Institution
    Sch. of Electr. Eng. & Telecommun., Univ. of New South Wales, Sydney, NSW, Australia
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    970
  • Lastpage
    974
  • Abstract
    Variations in the acoustic space due to changes in speaker mental state are potentially overshadowed by variability due to speaker identity and phonetic content. Using the Audio/Visual Emotion Challenge and Workshop 2013 Depression Dataset, we explore the suitability of i-vectors for reducing these latter sources of variability when distinguishing between low and high levels of speaker depression. In addition, we investigate whether supervised variability compensation methods such as Linear Discriminant Analysis (LDA) and Within Class Covariance Normalisation (WCCN), applied in the i-vector domain, could be used to compensate for speaker and phonetic variability. Classification results show that i-vectors formed using an over-sampling methodology outperform a baseline set by KL-means supervectors. However, these two compensation methods do not appear to improve system accuracy. Visualisations afforded by the t-Distributed Stochastic Neighbour Embedding (t-SNE) technique suggest that, despite the application of these techniques, speaker variability is still a strong confounding effect.
  • Keywords
    emotion recognition; sampling methods; signal classification; speech recognition; acoustic space due; audio-visual emotion challenge; depressed speech classification; depression dataset; i-vectors extraction; linear discriminant analysis; oversampled extraction; phonetic content; phonetic variability; small data; speaker depression; speaker identity; speaker mental state; speaker variability; supervised variability compensation method; t-distributed stochastic neighbour embedding technique; within class covariance normalisation; Accuracy; Acoustics; Speech; Speech recognition; Standards; Training; Vectors; Acoustic Variability; Depression; I-vectors; Linear Discriminant Analysis; Within Class Covariance Normalisation; t-Distributed Stochastic Neighbour Embedding;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6853741
  • Filename
    6853741
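
The abstract refers to Within Class Covariance Normalisation (WCCN) applied in the i-vector domain. As a rough illustration of the general technique only, and not the authors' implementation, the following minimal NumPy sketch estimates an averaged within-class covariance and whitens it. The function names, the `ridge` regulariser, and the choice of what counts as a "class" (for example, depression level or speaker) are assumptions made for this example; the record does not specify them.

```python
import numpy as np

def within_class_covariance(X, y):
    """Average of per-class covariance matrices (rows of X are vectors)."""
    classes = np.unique(y)
    d = X.shape[1]
    W = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        centred = Xc - Xc.mean(axis=0)       # remove the class mean
        W += centred.T @ centred / len(Xc)   # covariance of this class
    return W / len(classes)

def fit_wccn(X, y, ridge=1e-8):
    """Return B such that inv(W) = B @ B.T (Cholesky factor).

    `ridge` is a hypothetical small diagonal term added so that W stays
    invertible when only a few vectors per class are available.
    """
    W = within_class_covariance(X, y)
    W += ridge * np.eye(W.shape[0])
    return np.linalg.cholesky(np.linalg.inv(W))

def apply_wccn(X, B):
    """Map each row vector x to B.T @ x."""
    return X @ B
```

After this projection the within-class covariance of the transformed vectors is approximately the identity, which is the sense in which WCCN normalises within-class variability.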