Regional Information Center for Science and Technology - Analysis of linear prediction residual signal, its magnitude and phase for language identification on NIST LRE (2003) database

DocumentCode :

3725789

Title :

Analysis of linear prediction residual signal, its magnitude and phase for language identification on NIST LRE (2003) database

Author :

Arup Kumar Dutta;K. Sreenivasa Rao

Author_Institution :

Indian Institute of Technology Kharagpur, Kharagpur - 721 302, India

Year :

2015

Firstpage :

Lastpage :

Abstract :

The present work investigates the importance of excitation source features for language identification (LID). The linear prediction residual (LPR) represents the excitation source signal. By processing the LPR at the sub-segmental, segmental, and supra-segmental levels, we can capture the language-specific information present within a glottal cycle, within a sequence of a few glottal cycles, and at the prosody level, respectively. The analysis has been carried out on the NIST LRE (2003) speech database. Gaussian mixture models (GMMs) are used to build the language models. From the experimental results, we observe that analysis at the segmental level provides the most language-specific information. We have also observed that combining source and vocal tract features improves LID accuracy compared to LID systems built using source or vocal tract features alone. This indicates the significance of source features in language identification, and the complementary role they play alongside vocal tract features.
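
To make the terms in the abstract concrete, the following is a minimal sketch of how an LP residual and its magnitude and phase components are commonly computed. It is not the authors' implementation: the function names, the autocorrelation method, the default LP order of 10, and the use of the Hilbert envelope for the magnitude and the cosine of the analytic-signal phase for the phase are all assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import hilbert, lfilter


def lp_coefficients(frame, order):
    """Autocorrelation-method LP analysis (hypothetical helper).

    Returns a[1..order] such that s[n] is predicted by sum_k a[k] * s[n-k].
    """
    n = len(frame)
    # Autocorrelation of the frame for lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Solve the symmetric Toeplitz normal equations R a = r[1:]
    return solve_toeplitz(r[:order], r[1 : order + 1])


def lp_residual(frame, order=10):
    """Inverse-filter the frame with A(z) = 1 - sum_k a[k] z^-k.

    The default order of 10 is an assumed, illustrative value.
    """
    a = lp_coefficients(frame, order)
    inverse_filter = np.concatenate(([1.0], -a))
    return lfilter(inverse_filter, [1.0], frame)


def magnitude_and_phase(residual):
    """Split the residual into a Hilbert envelope and a cosine-phase signal."""
    analytic = hilbert(residual)           # residual + j * Hilbert transform
    envelope = np.abs(analytic)            # magnitude component
    # Real part of the analytic signal divided by its magnitude
    cos_phase = residual / np.maximum(envelope, 1e-12)
    return envelope, cos_phase
```

In a pipeline of the kind the abstract describes, features derived from these signals at different block sizes would then be modeled per language, for example with one Gaussian mixture model per language; the feature design and block sizes are not specified in this record.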

Keywords :

"Speech","Databases","Correlation coefficient","NIST","Mel frequency cepstral coefficient","Feature extraction","Correlation"

Publisher :

IEEE

Conference_Title :

Computer, Communication and Control (IC4), 2015 International Conference on

Type :

conf

DOI :

10.1109/IC4.2015.7375718

Filename :

7375718

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3725789