• DocumentCode
    1835971
  • Title

    Vocal emotion recognition in five languages of Assam using features based on MFCCs and Eigen Values of Autocorrelation Matrix in presence of babble noise

  • Author

    Kandali, Aditya Bihar ; Routray, Aurobinda ; Basu, Tapan Kumar

  • Author_Institution
    Dept. of Electr. Eng., Indian Inst. of Technol. Kharagpur, Kharagpur, India
  • fYear
    2010
  • fDate
    29-31 Jan. 2010
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    This work investigates whether, in vocal expressions of emotion, (i) a discrete emotion can be distinguished from 'no-emotion' (i.e. neutral), (ii) one discrete emotion can be distinguished from another, (iii) surprise, which is actually a cognitive component that could be present with any emotion, can also be recognized as a distinct emotion, and (iv) discrete emotions can be recognized cross-lingually. This study will enable us to get more information regarding the nature and function of emotion. Furthermore, this work will help in developing a generalized vocal emotion recognition system, which will increase the efficiency of human-machine interaction systems. In this work, an emotional speech database consisting of short sentences of six full-blown basic emotions and neutral is created with 140 simulated utterances per speaker of five native languages of Assam. This database is validated by a listening test. A new feature set is proposed based on Eigen Values of Autocorrelation Matrix (EVAM) of each frame of the speech signal. A Gaussian Mixture Model (GMM) is used as the classifier. The performance of the proposed feature set is compared with Mel Frequency Cepstral Coefficients (MFCCs) at a sampling frequency of 8.1 kHz and with additive babble noise at Signal-to-Noise Ratios (SNRs) of 5 dB and 0 dB under matched noise training and testing conditions.
  • Keywords
    Gaussian processes; eigenvalues and eigenfunctions; emotion recognition; man-machine systems; matrix algebra; natural language processing; noise; signal classification; speech recognition; user interfaces; Assam; Gaussian mixture model; Mel frequency cepstral coefficients; additive babble noise; autocorrelation matrix; eigen values; emotional speech database; frequency 8.1 kHz; generalized vocal emotion recognition system; human-machine interaction system; listening test; noise figure 5 dB; Additive noise; Autocorrelation; Emotion recognition; Man machine systems; Mel frequency cepstral coefficient; Natural languages; Signal to noise ratio; Spatial databases; Speech; Testing; Eigen Values of Autocorrelation Matrix; Full-blown Basic Emotion; GMM; MFCC; Vocal Emotion;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications (NCC), 2010 National Conference on
  • Conference_Location
    Chennai
  • Print_ISBN
    978-1-4244-6383-1
  • Type

    conf

  • DOI
    10.1109/NCC.2010.5430205
  • Filename
    5430205
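
The abstract describes two steps that are easy to illustrate: computing the eigenvalues of each frame's autocorrelation matrix (EVAM) and adding babble noise at a fixed SNR. The following is a minimal sketch of those two ideas, not the authors' implementation. The matrix order (8), frame length, hop size, the biased autocorrelation estimate, and the names `frame_signal`, `evam`, and `mix_at_snr` are all assumptions chosen for illustration; the record does not specify them. Random noise stands in for both speech and babble.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (trailing samples are dropped)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def evam(frame, order=8):
    """Eigenvalues of the order x order Toeplitz autocorrelation matrix of a frame.

    The order and the biased (1/N) estimate are assumptions, not taken from the paper.
    """
    r_full = np.correlate(frame, frame, mode="full")
    zero_lag = len(frame) - 1
    r = r_full[zero_lag:zero_lag + order] / len(frame)  # lags 0 .. order-1
    lags = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    R = r[lags]  # symmetric Toeplitz matrix, R[i, j] = r[|i - j|]
    return np.linalg.eigvalsh(R)[::-1]  # sorted in descending order


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the speech-to-noise power ratio is snr_db."""
    noise = noise[:len(speech)]  # assumes the noise is at least as long as the speech
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise


# Placeholder signals: one second at 8.1 kHz (synthetic, not real speech or babble).
rng = np.random.default_rng(0)
speech = rng.standard_normal(8100)
babble = rng.standard_normal(8100)

noisy = mix_at_snr(speech, babble, snr_db=5.0)
frames = frame_signal(noisy, frame_len=200, hop=100)
features = np.array([evam(f) for f in frames])  # one EVAM vector per frame
```

In a pipeline like the one the abstract outlines, per-frame vectors such as `features` would then be used to train one GMM per emotion class; that step is omitted here.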