DocumentCode :
2403942
Title :
Speaker-dependent speech recognition based on phone-like units models-application to voice dialling
Author :
Fontaine, Vincent ; Bourlard, Hervé
Author_Institution :
Faculte Polytech., Mons, Belgium
Volume :
2
fYear :
1997
fDate :
21-24 Apr 1997
Firstpage :
1527
Abstract :
This paper presents a speaker-dependent speech recognition system with application to voice dialling. The work was developed under the constraints imposed by voice dialling applications, i.e., low memory requirements and limited training material. Two methods for producing speaker-dependent word baseforms based on phone-like units (PLUs) are presented and compared: (1) a classical vector quantizer is used to divide the acoustic feature space into regions associated with PLUs; (2) a speaker-independent hybrid HMM/MLP recognizer is used to generate speaker-dependent PLU-based models. The work shows that very low error rates can be achieved even with a very simple system, namely a DTW-based recognizer. However, the best results are achieved when the hybrid HMM/MLP system is used to generate the word baseforms. Finally, a real-time demonstration simulating voice dialling functions, including keyword spotting and rejection capabilities, has been set up and can be tested online.
Keywords :
acoustic signal processing; hidden Markov models; multilayer perceptrons; speech coding; speech processing; speech recognition; telephony; vector quantisation; voice communication; DTW based recognizer; VQ; acoustic features; keyword rejection; keyword spotting; limited training material; low memory requirements; multilayer perceptron; online testing; phone like units models; real-time demonstration; speaker dependent speech recognition; speaker dependent word baseforms; speaker independent hybrid HMM/MLP recognizer; vector quantizer; very low error rates; voice dialling; voice dialling functions simulation; Acoustic testing; Automatic speech recognition; Cepstral analysis; Error analysis; Hidden Markov models; Hybrid power systems; Loudspeakers; Robustness; Spatial databases; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location :
Munich
ISSN :
1520-6149
Print_ISBN :
0-8186-7919-0
Type :
conf
DOI :
10.1109/ICASSP.1997.596241
Filename :
596241