Regional Information Center for Science and Technology - Prosodic feature based speech emotion recognition at segmental and supra segmental levels

DocumentCode :

2508304

Title :

Prosodic feature based speech emotion recognition at segmental and supra segmental levels

Author :

Jacob, Agnes ; Mythili, P.

Author_Institution :

Dept. of Appl. Electron. & Instrum., Gov. Eng. Coll., Kozhikode, India

fYear :

2015

fDate :

19-21 Feb. 2015

Firstpage :

Lastpage :

Abstract :

Speech emotion recognition plays an increasingly significant role in human-computer interfaces as well as in communication among human beings. This paper presents the results of investigations into emotion recognition based on the prosodic features of 1050 segmental and 1400 supra segmental speech wave files in English. The investigations covered neutral speech and six basic emotions, collected from ten female speakers of Indian English. The features considered are intensity, pitch, and duration or speech rate, which were statistically analyzed. The role of each feature in emotion recognition was quantitatively assessed in terms of the classification rates of the K-Nearest Neighbor, Naive Bayes and Artificial Neural Network classifiers. At the segmental level, all emotions could be classified, with an average emotion classification rate of 95.91% based on the prosodic feature set, and these results were validated. These results are significant because classifying emotions from minimal input saves time and effort. Moreover, the available literature acknowledges the existence of prosody only at the supra segmental level. At the supra segmental level, all emotions were recognized with an average classification rate of 91.96%.
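The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature summary (mean pitch, pitch spread, mean intensity, duration), the z-score scaling, the value k=3, the function names, and every number in the toy training set are assumptions made purely for illustration. Only the general idea, a K-Nearest Neighbor classifier applied to prosodic features, comes from the abstract.

```python
import math
import statistics
from collections import Counter


def prosodic_features(pitch_hz, intensity_db, duration_s):
    """Summarize one utterance as [mean pitch, pitch spread, mean intensity, duration]."""
    return [
        statistics.mean(pitch_hz),
        statistics.pstdev(pitch_hz),
        statistics.mean(intensity_db),
        duration_s,
    ]


# Hypothetical toy data: (pitch contour in Hz, intensity in dB, duration in s, label).
# These values are invented for the sketch and are not taken from the paper.
RAW = [
    ([200, 205, 198], [60, 61, 59], 1.00, "neutral"),
    ([195, 202, 199], [59, 60, 60], 1.05, "neutral"),
    ([280, 300, 270], [75, 77, 74], 0.70, "anger"),
    ([290, 310, 275], [76, 78, 75], 0.65, "anger"),
    ([180, 176, 170], [52, 51, 50], 1.40, "sadness"),
    ([178, 172, 168], [53, 50, 51], 1.45, "sadness"),
]

VECTORS = [prosodic_features(p, i, d) for p, i, d, _ in RAW]
LABELS = [label for _, _, _, label in RAW]

# Per-feature mean and standard deviation, so that Hz, dB and seconds
# contribute on a comparable scale to the distance.
COLUMNS = list(zip(*VECTORS))
MEANS = [statistics.mean(c) for c in COLUMNS]
STDS = [statistics.pstdev(c) or 1.0 for c in COLUMNS]


def normalize(vector):
    """Z-score a feature vector using the training-set statistics."""
    return [(v - m) / s for v, m, s in zip(vector, MEANS, STDS)]


TRAIN = [(normalize(v), label) for v, label in zip(VECTORS, LABELS)]


def classify(pitch_hz, intensity_db, duration_s, k=3):
    """Label an utterance by majority vote among its k nearest training vectors."""
    query = normalize(prosodic_features(pitch_hz, intensity_db, duration_s))
    nearest = sorted(TRAIN, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(classify([285, 295, 272], [75, 76, 74], 0.68))
```

A real system would extract pitch and intensity contours from the wave files and would evaluate the classifier on held-out utterances; the sketch skips both steps and only shows how prosodic summaries feed a nearest-neighbor vote.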

Keywords :

Bayes methods; emotion recognition; human computer interaction; neural nets; pattern classification; speech recognition; Indian English; artificial neural network classifiers; average emotion classification rate; female speakers; human-computer interfaces; k-nearest neighbor; naive Bayes; prosodic feature based speech emotion recognition; speech rate; speech wave files; supra segmental levels; Analysis of variance; Artificial neural networks; Databases; Emotion recognition; Feature extraction; Speech; Speech recognition; prosody; segmental and supra segmental utterances; speech emotion recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Signal Processing, Informatics, Communication and Energy Systems (SPICES), 2015 IEEE International Conference on

Conference_Location :

Kozhikode

Type :

conf

DOI :

10.1109/SPICES.2015.7091377

Filename :

7091377

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2508304