Regional Information Center for Science and Technology - Speaker recognition using artificial neural networks

DocumentCode :

1706137

Title :

Speaker recognition using artificial neural networks

Author :

Mueen, Fazal ; Ahmed, Ayaz ; Sanaullah ; Gaba, Asim

Author_Institution :

Dept. of Electr. Eng., Univ. of Eng. & Technol., Lahore, Pakistan

Volume :

fYear :

2002

Firstpage :

Abstract :

We report on the application of an RNN (recurrent neural net) to an open-set, text-dependent speaker identification task. MFCC (Mel-frequency cepstral coefficient) features from the speech utterance are fed to a neural-network-based classifier to identify the speakers. We use a feedforward net architecture as proposed by A.J. Robinson (IEEE Trans. on Neural Networks, vol. 5, no. 2, 1994). We introduce a fully connected hidden layer between the input and state nodes and the output. We show that this hidden layer makes the learning of complex classification tasks more efficient. Training uses backpropagation through time. There is one output unit per speaker, with the training targets corresponding to speaker identity. For 10 male speakers, we obtain a true acceptance rate of 100% with a false acceptance rate of 10%. For 14 speakers, these figures are 94% and 12%, respectively. We also investigate how identification accuracy is affected by environmental factors (signal level, change of microphone), the choice of acoustic vectors (FFT or MFCC), the size of the training database, and the inclusion of fundamental frequency. MFCC features plus fundamental frequency give the best results.
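The architecture described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract specifies only a fully connected hidden layer between the input and state nodes and the output, with one output unit per speaker. Everything else here is an assumption made for illustration: the layer sizes, the tanh and sigmoid activations, the random untrained weights, averaging frame outputs into an utterance score, and the rejection threshold used for the open-set decision. Training by backpropagation through time is omitted.

```python
import math
import random


def init_layer(n_out, n_in, rng):
    """Small random weights; the last entry of each row is a bias (assumed init)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def dense(weights, inputs, activation):
    """Apply one fully connected layer to a list of input values."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in weights]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class RecurrentSpeakerNet:
    """Input and state nodes feed a hidden layer, which feeds output and next state."""

    def __init__(self, n_features, n_state, n_hidden, n_speakers, seed=0):
        rng = random.Random(seed)
        self.n_state = n_state
        self.w_hidden = init_layer(n_hidden, n_features + n_state, rng)
        self.w_output = init_layer(n_speakers, n_hidden, rng)  # one unit per speaker
        self.w_state = init_layer(n_state, n_hidden, rng)

    def score(self, frames):
        """Return per-speaker scores averaged over the frames of one utterance."""
        if not frames:
            raise ValueError("utterance has no frames")
        state = [0.0] * self.n_state
        totals = [0.0] * len(self.w_output)
        for frame in frames:
            hidden = dense(self.w_hidden, list(frame) + state, math.tanh)
            outputs = dense(self.w_output, hidden, sigmoid)
            state = dense(self.w_state, hidden, sigmoid)  # fed back at the next frame
            totals = [t + o for t, o in zip(totals, outputs)]
        return [t / len(frames) for t in totals]


def identify(scores, threshold):
    """Open-set decision: index of the best speaker, or None to reject."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] >= threshold else None


# Hypothetical utterance: 50 frames of 13 feature values standing in for MFCCs.
rng = random.Random(1)
utterance = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(50)]
net = RecurrentSpeakerNet(n_features=13, n_state=8, n_hidden=16, n_speakers=10)
print(identify(net.score(utterance), threshold=0.9))
```

In an open-set task the threshold matters because an utterance may come from a speaker outside the enrolled set. Raising it lowers the false acceptance rate at the cost of rejecting more genuine speakers.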

Keywords :

backpropagation; fast Fourier transforms; feedforward neural nets; pattern classification; recurrent neural nets; signal classification; speaker recognition; FFT; MFCC; Mel-frequency cepstral coefficient; RNN; acoustic vectors; artificial neural networks; feedforward net architecture; fundamental frequency; identification accuracy; recurrent neural net; signal level; text-dependent speaker identification; training database; Artificial neural networks; Backpropagation; Cepstral analysis; Environmental factors; Loudspeakers; Mel frequency cepstral coefficient; Neural networks; Recurrent neural networks; Speaker recognition; Speech

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Students Conference, 2002. ISCON '02. Proceedings. IEEE

Print_ISBN :

0-7803-7505-X

Type :

conf

DOI :

10.1109/ISCON.2002.1215947

Filename :

1215947

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1706137