Text-prompted speaker verification experiments with phoneme specific MLPs

Author

Delacrétaz, Dijana Petrovska ; Hennebert, Jean

Author_Institution

Swiss Fed. Inst. of Technol., Switzerland

Volume

2

fYear

1998

fDate

12-15 May 1998

Firstpage

777

Abstract

The aims of the study described in this paper are (1) to assess the relative speaker-discriminant properties of phonemes and (2) to investigate the importance of temporal frame-to-frame information for speaker modelling in the framework of a text-prompted speaker verification system using hidden Markov models (HMMs) and multilayer perceptrons (MLPs). It is shown that, under similar experimental conditions, nasals, fricatives and vowels convey more speaker-specific information than plosives and liquids. Regarding the influence of frame-to-frame temporal information, significant improvements are reported from the inclusion of several acoustic frames at the input of the MLPs. The results also tend to show that each phoneme has its own optimal MLP context size that yields the best equal error rate (EER).

Keywords

acoustic signal processing; error statistics; hidden Markov models; multilayer perceptrons; speaker recognition; speech processing; HMM; acoustic frames; equal error rate; fricatives; liquids; nasals; optimal MLP context size; phoneme specific MLP; plosives; speaker discriminant properties; speaker modelling; temporal frame-to-frame information; text-prompted speaker verification experiments; vowels; Circuits and systems; Error analysis; Loudspeakers; Security; Speech coding; Speech recognition; Text recognition; Vocabulary

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on

Conference_Location

Seattle, WA

ISSN

1520-6149

Print_ISBN

0-7803-4428-6

Type

conf

DOI

10.1109/ICASSP.1998.675380

Filename

675380