Auditory signal processing as a basis for speaker recognition

Author

Quatieri, T.F. ; Malyska, N. ; Sturim, D.E.

Author_Institution

Lincoln Lab., MIT, Lexington, MA, USA

fYear

2003

fDate

19-22 Oct. 2003

Firstpage

111

Lastpage

114

Abstract

We exploit models of auditory signal processing at different levels along the auditory pathway for use in speaker recognition. A low-level nonlinear model, at the cochlea, provides accentuated signal dynamics, while a high-level model, at the inferior colliculus, provides frequency analysis of modulation components that reveals an additional temporal structure. A variety of features are derived from the low-level dynamic and high-level modulation signals. Fusion of likelihood scores from feature sets at different auditory levels with scores from standard Mel-cepstral features provides an encouraging speaker recognition performance gain over use of the Mel-cepstrum alone with corpora from land-line and cellular telephone communications.

Keywords

hearing; modulation; speaker recognition; speech processing; Mel-cepstral features; accentuated signal dynamics; auditory pathway; auditory signal processing; cellular telephone communications; cochlea; frequency analysis; inferior colliculus; land-line telephone communications; likelihood scores; modulation components; temporal structure; Adaptive filters; Channel bank filters; Filter bank; Finite impulse response filter; Frequency; Laboratories; Performance gain; Signal processing; Smoothing methods

fLanguage

English

Publisher

IEEE

Conference_Titel

Applications of Signal Processing to Audio and Acoustics, 2003 IEEE Workshop on.

Print_ISBN

0-7803-7850-4

Type

conf

DOI

10.1109/ASPAA.2003.1285832

Filename

1285832