Regional Information Center for Science and Technology - Analysing the performance of speaker identification task using different short term and long term features

DocumentCode :

238044

Title :

Analysing the performance of speaker identification task using different short term and long term features

Author :

Suba, P. ; Bharathi, B.

Author_Institution :

Dept. of Comput. Sci. & Eng., SSN Coll. of Eng., Chennai, India

fYear :

2014

fDate :

8-10 May 2014

Firstpage :

1451

Lastpage :

1456

Abstract :

Automatic Speaker Recognition (ASR) identifies a speaker from information carried in the speech signal. The goal is to have a machine automatically recognize a person, or authenticate a person's claimed identity, from his or her speech. This paper addresses the speaker identification task using different short-term and long-term features. Short-term features are extracted frame by frame and represent the characteristics of the speech signal with reduced redundancy. In the training phase, short-term features such as Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Cepstral Coefficients (LPCC), and Perceptual Linear Predictive (PLP) coefficients are extracted and modeled using Gaussian Mixture Models (GMM). Long-term features such as prosody capture speaking behavior and are typically obtained from portions of the speech signal longer than one frame. Long-term features are likewise extracted from the speech signal and modeled using GMMs. The short-term and long-term features are modeled separately and in combination using GMMs to obtain the target speaker models. In the testing phase, features are extracted from test speech signals of different durations. The extracted features are scored against the trained speaker models to obtain the decisions. Finally, the overall performance is examined for the combinations of short-term and long-term features.
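
The testing phase described in the abstract (score test features against each speaker's GMM, then decide) can be illustrated with a minimal sketch. This is not the authors' implementation, which this record does not give. It assumes diagonal-covariance GMMs whose parameters are already trained, and it picks the speaker whose model has the highest average per-frame log-likelihood. The names `log_gauss_diag`, `gmm_log_likelihood`, `identify`, `SPEAKER_MODELS`, and `TEST_FRAMES` are hypothetical, and the two-dimensional toy vectors merely stand in for real MFCC, LPCC, PLP, or prosodic feature vectors.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of `frames` under a GMM.

    `gmm` is a list of (weight, mean, var) components (assumed format).
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over the mixture components for numerical stability.
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)


def identify(frames, speaker_models):
    """Return the best-scoring speaker name and the score of every model."""
    scores = {
        name: gmm_log_likelihood(frames, gmm)
        for name, gmm in speaker_models.items()
    }
    return max(scores, key=scores.get), scores


# Toy, hand-set models (illustrative only; real models would be fit with EM
# on each speaker's training features).
SPEAKER_MODELS = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}

# Toy test "frames" standing in for extracted feature vectors.
TEST_FRAMES = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]

best, scores = identify(TEST_FRAMES, SPEAKER_MODELS)
print(best)
```

Averaging over frames keeps scores comparable across test signals of different durations, which is one plausible way to handle the variable-length test utterances the abstract mentions.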

Keywords :

Gaussian processes; feature extraction; mixture models; speaker recognition; ASR; GMM; Gaussian mixture models; LPCC; MFCC; Mel frequency cepstral coefficient; PLP; automatic speaker recognition; different long term features; different short term features; linear predictive cepstral coefficient; perceptual linear predictive; speaker identification performance analysis; speech signal characteristics; MATLAB; Signal resolution; Speech; Prosody

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Advanced Communication Control and Computing Technologies (ICACCCT), 2014 International Conference on

Conference_Location :

Ramanathapuram

Print_ISBN :

978-1-4799-3913-8

Type :

conf

DOI :

10.1109/ICACCCT.2014.7019342

Filename :

7019342

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=238044