A hybrid parameterization technique for Speaker Identification

Author

Gomez, P. ; Alvarez, A. ; Mazaira, L.M. ; Fernandez, R. ; Nieto, V. ; Martinez, R. ; Munoz, C. ; Rodellar, V.

Author_Institution

GIAPSI, Univ. Politec. de Madrid, Boadilla del Monte, Spain

fYear

2008

fDate

25-29 Aug. 2008

Firstpage

1

Lastpage

5

Abstract

Classical parameterization techniques for Speaker Identification use the codification of the power spectral density of raw speech, without discriminating articulatory features produced by vocal tract dynamics (acoustic-phonetics) from glottal source biometry. This paper presents a study on separating voicing fragments of speech into vocal and glottal components. The vocal component is dominated by the vocal tract transfer function, estimated adaptively to track the acoustic-phonetic sequence of the message; the glottal component is dominated by the glottal characteristics of the speaker and the phonation gesture. The separation methodology is based on Joint Process Estimation under the hypothesis that the vocal and glottal spectral distributions are uncorrelated. Its application to voiced speech is presented in the time and frequency domains. The parameterization methodology is also described. Speaker Identification experiments conducted on 245 speakers are shown, comparing different parameterization strategies. The results confirm that decoupled parameterization performs better than approaches based on plain speech parameterization.
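
The abstract does not give the details of the estimator. As a rough illustration of the general idea behind joint process estimation (an adaptive filter predicts the part of one signal that is correlated with a reference, leaving a residual that is approximately uncorrelated with it), a minimal sketch on synthetic signals might look as follows. The NLMS transversal filter, the function names (`nlms_separate`, `correlation`, `demo`), the filter order, the step size, and the synthetic data are all assumptions made for illustration; this is not the authors' implementation and it does not operate on speech.

```python
import random


def nlms_separate(primary, reference, order=4, mu=0.1, eps=1e-8):
    """Split `primary` into a part predicted from `reference` and a residual.

    Hypothetical helper: a normalized-LMS transversal filter, used here only
    to illustrate decorrelation. primary[n] = estimate[n] + residual[n].
    """
    weights = [0.0] * order
    taps = [0.0] * order  # most recent reference samples, newest first
    estimate, residual = [], []
    for d, x in zip(primary, reference):
        taps = [x] + taps[:-1]
        y = sum(w * t for w, t in zip(weights, taps))  # correlated part
        e = d - y                                      # uncorrelated residual
        norm = eps + sum(t * t for t in taps)
        weights = [w + mu * e * t / norm for w, t in zip(weights, taps)]
        estimate.append(y)
        residual.append(e)
    return estimate, residual


def correlation(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5


def demo(n=4000, seed=0):
    """Synthetic check: primary = filtered reference + independent noise."""
    rng = random.Random(seed)
    reference = [rng.gauss(0.0, 1.0) for _ in range(n)]
    independent = [rng.gauss(0.0, 0.5) for _ in range(n)]
    primary, prev = [], 0.0
    for x, v in zip(reference, independent):
        primary.append(0.8 * x - 0.3 * prev + v)  # assumed 2-tap mixing
        prev = x
    _, residual = nlms_separate(primary, reference)
    half = n // 2  # skip the adaptation transient
    return (
        correlation(residual[half:], independent[half:]),
        correlation(residual[half:], reference[half:]),
    )
```

Calling `demo()` returns two correlations measured after the filter has adapted: the residual against the independent component, which should be high, and the residual against the reference, which should be close to zero. That is the sense in which the two outputs are decoupled under an uncorrelation hypothesis.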

Keywords

estimation theory; frequency-domain analysis; speaker recognition; time-domain analysis; transfer functions; acoustic-phonetic sequence; articulatory features; codification; decoupled parameterization; frequency domains; glottal characteristics; glottal source biometry; glottal spectral distributions; joint process estimation; parameterization strategies; phonation gesture; plain speech parameterization; power spectral density; speaker identification; time domains; uncorrelation hypothesis; vocal spectral distributions; vocal tract dynamics; vocal tract transfer function; voiced speech; voicing fragments; Estimation; Europe; Filtering; Mel frequency cepstral coefficient; Speech; Speech processing;

fLanguage

English

Publisher

IEEE

Conference_Titel

Signal Processing Conference, 2008 16th European

Conference_Location

Lausanne

ISSN

2219-5491

Type

conf

Filename

7080538