DocumentCode
2023457
Title
Auditory model representation for speaker recognition
Author
Colombi, John ; Anderson, Timothy R. ; Rogers, Steven K. ; Ruck, Dennis W. ; Warhola, G.T.
Author_Institution
AFIT/EN, Wright-Patterson AFB, OH, USA
Volume
2
fYear
1993
fDate
27-30 April 1993
Firstpage
700
Abstract
A study on the KING database is presented that compares proven spectral processing techniques with an auditory model representation for speaker recognition. The feature sets compared are LPC (linear predictive coding) cepstral coefficients and auditory nerve firing rates provided by the Payton model. The two feature sets were quantized by two clustering algorithms, a Linde-Buzo-Gray algorithm and a Kohonen self-organizing feature map. The resulting classification, based on vector quantization distortion, indicates that the auditory model provides accuracies comparable with LPC cepstral coefficients in nonstudio-quality environments and over multiple sessions. For a 10-speaker subset using only voiced frames of 15-s segments, both feature sets achieve identification rates over 80%. The cepstral features perform better on verification tasks, as measured with receiver operating characteristic curves.
Keywords
hearing; linear predictive coding; physiological models; self-organising feature maps; speech recognition; vector quantisation; KING database; Kohonen self-organizing feature map; Linde-Buzo-Gray algorithm; accuracies; auditory model representation; auditory nerve firing rates; cepstral coefficients; clustering algorithms; identification rate; speaker recognition; vector quantized distortion based classification; verification
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location
Minneapolis, MN, USA
ISSN
1520-6149
Print_ISBN
0-7803-7402-9
Type
conf
DOI
10.1109/ICASSP.1993.319407
Filename
319407