DocumentCode :
3725789
Title :
Analysis of linear prediction residual signal, its magnitude and phase for language identification on NIST LRE (2003) database
Author :
Arup Kumar Dutta;K. Sreenivasa Rao
Author_Institution :
Indian Institute of Technology Kharagpur, Kharagpur - 721 302, India
fYear :
2015
Firstpage :
1
Lastpage :
4
Abstract :
The present work investigates the importance of excitation source features for language identification (LID). The linear prediction residual (LPR) represents the excitation source signal. By processing the LPR at the sub-segmental, segmental, and supra-segmental levels, we capture the language-specific information present within a glottal cycle, within a sequence of a few glottal cycles, and at the prosody level, respectively. The analysis is carried out on the NIST LRE (2003) speech database. Gaussian mixture models (GMMs) are used to build the language models. The experimental results show that analysis at the segmental level provides the most language-specific information. We also observe that combining source and vocal tract features improves LID accuracy over systems built using source or vocal tract features alone. This indicates the significance of source features in language identification and the complementary role they play alongside vocal tract features.
Keywords :
"Speech","Databases","Correlation coefficient","NIST","Mel frequency cepstral coefficient","Feature extraction","Correlation"
Publisher :
IEEE
Conference_Title :
Computer, Communication and Control (IC4), 2015 International Conference on
Type :
conf
DOI :
10.1109/IC4.2015.7375718
Filename :
7375718