Speaker-dependent speech recognition based on phone-like units models-application to voice dialling

Author

Fontaine, Vincent; Bourlard, Hervé

Author_Institution

Faculte Polytech., Mons, Belgium

Volume

2

fYear

1997

fDate

21-24 Apr 1997

Firstpage

1527

Abstract

This paper presents a speaker-dependent speech recognition system with application to voice dialling. The work was developed under the constraints imposed by voice dialling applications, i.e., low memory requirements and limited training material. Two methods for producing speaker-dependent word baseforms based on phone-like units (PLUs) are presented and compared: (1) a classical vector quantizer is used to divide the acoustic feature space into regions associated with PLUs; (2) a speaker-independent hybrid HMM/MLP recognizer is used to generate speaker-dependent PLU-based models. The work shows that very low error rates can be achieved even with very simple systems, namely a DTW-based recognizer. However, the best results are achieved when the hybrid HMM/MLP system is used to generate the word baseforms. Finally, a real-time demonstration simulating voice dialling functions, including keyword spotting and rejection capabilities, has been set up and can be tested online.
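
The first method in the abstract (a vector quantizer defining PLU regions, combined with a DTW-based recognizer) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy two-dimensional codebook, the names `quantize`, `make_baseform`, `dtw_cost` and `recognize`, and the choice to collapse repeated labels into a compact baseform are all assumptions made for illustration. The abstract does not specify features, codebook size, or DTW path constraints.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(frames, codebook):
    """Map each frame to the index of its nearest codeword (its PLU label)."""
    return [min(range(len(codebook)), key=lambda k: sq_dist(f, codebook[k]))
            for f in frames]

def make_baseform(frames, codebook):
    """Collapse runs of identical labels into a compact PLU sequence.
    Assumption: storing labels rather than frames is what keeps memory low."""
    labels = quantize(frames, codebook)
    return [lab for i, lab in enumerate(labels)
            if i == 0 or lab != labels[i - 1]]

def dtw_cost(frames, baseform, codebook):
    """DTW between test frames and the centroids named by a baseform.
    Each PLU may absorb one or more consecutive frames; none may be skipped."""
    m = len(baseform)
    prev = [math.inf] * m
    for i, frame in enumerate(frames):
        cur = [math.inf] * m
        for j in range(m):
            d = sq_dist(frame, codebook[baseform[j]])
            if i == 0:
                best = 0.0 if j == 0 else math.inf  # path starts at first PLU
            else:
                stay = prev[j]                       # remain in the same PLU
                advance = prev[j - 1] if j > 0 else math.inf
                best = min(stay, advance)
            cur[j] = d + best
        prev = cur
    return prev[m - 1]  # path must end in the last PLU

def recognize(frames, lexicon, codebook):
    """Return the lexicon entry whose baseform has the lowest DTW cost."""
    return min(lexicon, key=lambda w: dtw_cost(frames, lexicon[w], codebook))

# Hypothetical toy data: three codewords standing in for PLU centroids.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
alice_training = [(0.1, 0.0), (0.0, 0.1), (0.9, 0.1), (1.0, 0.0)]
bob_training = [(0.0, 0.9), (0.1, 1.0), (0.0, 0.1)]
lexicon = {
    "alice": make_baseform(alice_training, codebook),
    "bob": make_baseform(bob_training, codebook),
}
test_utterance = [(0.0, 0.0), (0.1, 0.1), (1.0, 0.1)]
result = recognize(test_utterance, lexicon, codebook)
```

In this sketch a single training utterance per name yields a baseform, which mirrors the limited-training-material constraint only loosely. The second method, which derives baseforms from a hybrid HMM/MLP recognizer, is not covered here.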

Keywords

acoustic signal processing; hidden Markov models; multilayer perceptrons; speech coding; speech processing; speech recognition; telephony; vector quantisation; voice communication; DTW based recognizer; VQ; acoustic features; keyword rejection; keyword spotting; limited training material; low memory requirements; multilayer perceptron; online testing; phone like units models; real-time demonstration; speaker dependent speech recognition; speaker dependent word baseforms; speaker independent hybrid HMM/MLP recognizer; vector quantizer; very low error rates; voice dialling; voice dialling functions simulation; Acoustic testing; Automatic speech recognition; Cepstral analysis; Error analysis; Hidden Markov models; Hybrid power systems; Loudspeakers; Robustness; Spatial databases; Speech recognition

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on

Conference_Location

Munich

ISSN

1520-6149

Print_ISBN

0-8186-7919-0

Type

conf

DOI

10.1109/ICASSP.1997.596241

Filename

596241