DocumentCode :
238044
Title :
Analysing the performance of speaker identification task using different short term and long term features
Author :
Suba, P. ; Bharathi, B.
Author_Institution :
Dept. of Comput. Sci. & Eng., SSN Coll. of Eng., Chennai, India
fYear :
2014
fDate :
8-10 May 2014
Firstpage :
1451
Lastpage :
1456
Abstract :
Automatic Speaker Recognition (ASR) extracts information that identifies a particular speaker. The goal is to have a machine automatically recognize a person, or authenticate a person's claimed identity, from his or her speech. This paper studies the speaker identification task using different short-term and long-term features. Short-term features are extracted frame by frame and represent the characteristics of the speech signal with reduced redundancy. In the training phase, short-term features such as Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Cepstral Coefficients (LPCC), and Perceptual Linear Predictive (PLP) features are extracted and modeled using Gaussian Mixture Models (GMM). Long-term features such as prosody are used to identify speaking behavior and are often obtained from portions of the speech signal longer than one frame; they are likewise extracted from the speech signal and trained using GMMs. The short-term and long-term features are modeled both separately and in combination using GMMs to obtain the target speaker model. In the testing phase, features are extracted from test speech signals of different durations and given to the trained speaker models to obtain the identification decisions. Finally, the overall performance is examined for the combinations of short-term and long-term features.
Keywords :
Gaussian processes; feature extraction; mixture models; speaker recognition; ASR; GMM; Gaussian mixture models; LPCC; MFCC; Mel frequency cepstral coefficient; PLP; automatic speaker recognition; different long term features; different short term features; linear predictive cepstral coefficient; perceptual linear predictive; speaker identification performance analysis; speech signal characteristics; MATLAB; Signal resolution; Speech; Prosody
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Communication Control and Computing Technologies (ICACCCT), 2014 International Conference on
Conference_Location :
Ramanathapuram
Print_ISBN :
978-1-4799-3913-8
Type :
conf
DOI :
10.1109/ICACCCT.2014.7019342
Filename :
7019342