DocumentCode :
1706137
Title :
Speaker recognition using artificial neural networks
Author :
Mueen, Fazal ; Ahmed, Ayaz ; Sanaullah ; Gaba, Asim
Author_Institution :
Dept. of Electr. Eng., Univ. of Eng. & Technol., Lahore, Pakistan
Volume :
1
fYear :
2002
Firstpage :
99
Abstract :
We report on the application of a recurrent neural network (RNN) to an open-set, text-dependent speaker identification task. Mel-frequency cepstral coefficient (MFCC) features extracted from the speech utterance are fed to a neural-network-based classifier that identifies the speaker. We use a feedforward net architecture as proposed by A.J. Robinson (IEEE Trans. on Neural Networks, vol. 5, no. 2, 1994), and introduce a fully connected hidden layer between the input and state nodes and the output. We show that this hidden layer makes the learning of complex classification tasks more efficient. Training uses backpropagation through time. There is one output unit per speaker, with the training targets corresponding to speaker identity. For 10 male speakers, we obtain a true acceptance rate of 100% with a false acceptance rate of 10%. For 14 speakers, these figures are 94% and 12%, respectively. We also investigate how identification accuracy is affected by environmental factors (signal level, change of microphone), the choice of acoustic vectors (FFT or MFCC), the size of the training database, and the inclusion of fundamental frequency. MFCC features plus fundamental frequency give the best results.
Keywords :
backpropagation; fast Fourier transforms; feedforward neural nets; pattern classification; recurrent neural nets; signal classification; speaker recognition; FFT; MFCC; Mel-frequency cepstral coefficient; RNN; acoustic vectors; artificial neural networks; backpropagation; feedforward net architecture; fundamental frequency; identification accuracy; recurrent neural net; signal level; speaker recognition; text-dependent speaker identification; training database; Artificial neural networks; Backpropagation; Cepstral analysis; Environmental factors; Loudspeakers; Mel frequency cepstral coefficient; Neural networks; Recurrent neural networks; Speaker recognition; Speech;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Students Conference, 2002. ISCON '02. Proceedings. IEEE
Print_ISBN :
0-7803-7505-X
Type :
conf
DOI :
10.1109/ISCON.2002.1215947
Filename :
1215947