  • DocumentCode
    3167236
  • Title
    Speaker variability in emotion recognition - an adaptation based approach
  • Author
    Ding, Ni ; Sethu, Vidhyasaharan ; Epps, Julien ; Ambikairajah, Eliathamby
  • Author_Institution
    Sch. of Electr. Eng. & Telecommun., Univ. of New South Wales, Sydney, NSW, Australia
  • fYear
    2012
  • fDate
    25-30 March 2012
  • Firstpage
    5101
  • Lastpage
    5104
  • Abstract
    None of the features commonly utilised in automatic emotion classification systems completely disassociate emotion-specific information from speaker-specific information. Consequently, speaker-specific variability degrades the performance of emotion classification systems, and existing systems frequently mitigate it with some form of speaker normalisation. Speaker adaptation offers an alternative to normalisation. This paper proposes a novel bootstrapping technique for GMM-based emotion classification, which selects appropriate initial models from a large training pool prior to speaker adaptation of the emotion models. Evaluations on the LDC Emotional Prosody and FAU Aibo corpora reveal that an emotion classification system based on the proposed bootstrapping method outperforms systems based on speaker normalisation as long as a small amount of labelled adaptation data is available. It also outperforms speaker adaptation from common initial models estimated from all training speakers.
  • Keywords
    Gaussian processes; emotion recognition; speaker recognition; FAU Aibo corpora; GMM; LDC emotional prosody; automatic emotion classification systems; bootstrapping technique; emotion-specific information disassociation; labelled adaptation data; speaker adaptation; speaker normalisation; speaker-specific variability; training pool; Accuracy; Adaptation models; Data models; Feature extraction; Speech; Training; bootstrapping; emotion classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4673-0045-2
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2012.6289068
  • Filename
    6289068