Regional Information Center for Science and Technology (RICEST) - A unique approach in text independent speaker recognition using MFCC feature sets and probabilistic neural network

DocumentCode :

1864154

Title :

A unique approach in text independent speaker recognition using MFCC feature sets and probabilistic neural network

Author :

Ahmad, Khan Suhail ; Thosar, Anil S. ; Nirmal, Jagannath H. ; Pande, Vinay S.

Author_Institution :

Dept. of Electron., K.J. Somaiya Coll. of Eng., Mumbai, India

fYear :

2015

fDate :

4-7 Jan. 2015

Firstpage :

Lastpage :

Abstract :

This paper motivates the use of a combination of mel frequency cepstral coefficients (MFCC) and their delta derivatives (DMFCC and DDMFCC), calculated using mel-spaced Gaussian filter banks, for text-independent speaker recognition. MFCC, modeled on the human auditory system, shows robustness against noise and session changes and hence has become synonymous with speaker recognition. Our main aim is to test the accuracy of the proposed feature set for different values of frame overlap and MFCC feature vector size, in order to identify the configuration with the highest accuracy. Principal component analysis (PCA) is applied before the training and testing stages for feature dimensionality reduction, which increases computing speed and lowers the memory required for processing. Using a probabilistic neural network (PNN) in the modeling domain provides the advantage of lower operational times during the training stage. The experiments examined the percentage identification accuracy (PIA) of MFCC alone, of MFCC combined with DMFCC, and of all three feature sets (MFCC, DMFCC and DDMFCC) combined. The proposed feature set attains an identification accuracy of 94% for a frame overlap of 90% and an MFCC feature size of 18 coefficients, outperforming the identification rates of the other two feature sets. The speaker recognition experiments were conducted on the Voxforge database.
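
The pipeline named in the abstract (static plus delta features, PCA, then a PNN) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the MFCC frames are synthetic stand-ins rather than coefficients from mel-spaced Gaussian filter banks, and the delta window width, the number of PCA components, the kernel width `sigma`, the per-frame decision rule and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def hop_length(frame_len, overlap):
    """Samples between frame starts; a 90% overlap gives a hop of 10% of the frame."""
    return int(round(frame_len * (1.0 - overlap)))

def delta(features, width=2):
    """Regression-based delta along the time axis; features has shape (frames, coeffs)."""
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    num_frames = features.shape[0]
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + num_frames]
        behind = padded[width - n : width - n + num_frames]
        out += n * (ahead - behind)
    return out / denom

def stack_features(mfcc):
    """Concatenate MFCC, DMFCC and DDMFCC column-wise."""
    d1 = delta(mfcc)
    d2 = delta(d1)
    return np.hstack([mfcc, d1, d2])

def pca_fit(X, n_components):
    """Return the training mean and the leading principal directions (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

def pnn_predict(train_X, train_y, test_X, sigma=2.0):
    """PNN decision rule: average Gaussian kernel per class, pick the largest."""
    classes = np.unique(train_y)
    d2 = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))
    scores = np.stack([kernel[:, train_y == c].mean(axis=1) for c in classes], axis=1)
    return classes[scores.argmax(axis=1)]

def demo(seed=0):
    """Two synthetic 'speakers' with 18 coefficients per frame (illustrative data only)."""
    rng = np.random.default_rng(seed)
    feats, labels = [], []
    for speaker, offset in enumerate((0.0, 3.0)):
        mfcc = rng.normal(offset, 1.0, size=(200, 18))  # stand-in for real MFCCs
        feats.append(stack_features(mfcc))
        labels.append(np.full(200, speaker))
    train_X = np.vstack([f[:150] for f in feats])
    train_y = np.concatenate([l[:150] for l in labels])
    test_X = np.vstack([f[150:] for f in feats])
    test_y = np.concatenate([l[150:] for l in labels])
    mean, comps = pca_fit(train_X, n_components=10)
    pred = pnn_predict(pca_transform(train_X, mean, comps), train_y,
                       pca_transform(test_X, mean, comps))
    return float((pred == test_y).mean())

if __name__ == "__main__":
    print("hop for a 400-sample frame at 90% overlap:", hop_length(400, 0.9))
    print("per-frame identification accuracy:", demo())
```

A PNN has no iterative training: the "training" step only stores the reduced feature vectors, which fits the abstract's remark about lower operational times in the training stage. The accuracy printed by this toy example says nothing about the 94% figure reported for the Voxforge experiments.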

Keywords :

Gaussian processes; cepstral analysis; channel bank filters; learning (artificial intelligence); neural nets; principal component analysis; probability; speaker recognition; DDMFCC feature set; MFCC feature size; MFCC feature vector sizes; PIA; PNN; Voxforge database; delta derivatives; feature dimensionality reduction; frame overlap; human auditory system; identification accuracy; mel frequency cepstral coefficients; mel spaced Gaussian filter banks; memory constraint; modeling domain; noise change; operational times; percentage identification accuracy; principal component analysis; probabilistic neural network; session change; speaker recognition; testing stage; text independent speaker recognition; training stage; Accuracy; Feature extraction; Filter banks; Hidden Markov models; Mel frequency cepstral coefficient; Speaker recognition; Speech; Delta Derivatives; MFCC; Mel Spaced Gaussian Filter Banks; Percentage Identification Accuracy; Principal Component Analysis; Probabilistic Neural Networks; speaker recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Advances in Pattern Recognition (ICAPR), 2015 Eighth International Conference on

Conference_Location :

Kolkata

Type :

conf

DOI :

10.1109/ICAPR.2015.7050669

Filename :

7050669

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1864154