DocumentCode :
2311736
Title :
An incremental speaker-adaptation technique for hybrid HMM-MLP recognizer
Author :
Neto, Joao P. ; Martins, Ciro ; Almeida, Luís B.
Author_Institution :
INESC, Inst. Superior Tecnico, Lisbon, Portugal
Volume :
3
fYear :
1996
fDate :
3-6 Oct 1996
Firstpage :
1293
Abstract :
One of the problems of speaker-independent continuous speech recognition systems is their inability to cope with inter-speaker variability. When test speakers have characteristics different from those present in the training pool, we observe a large degradation in system performance. To overcome this problem, speaker-adaptation techniques may be used to provide near speaker-dependent accuracy. In this work we present a speaker-adaptation technique applied to a hybrid HMM-MLP system for large-vocabulary continuous speech recognition. The technique is based on an architecture that employs a trainable linear input network (LIN) to map speaker-specific input feature vectors to the speaker-independent system. It is evaluated in an incremental speaker-adaptation task using a Wall Street Journal (WSJ) database, in both supervised and unsupervised modes. The results show that speaker adaptation within the hybrid framework can substantially improve system performance.
Keywords :
feedforward neural nets; hidden Markov models; learning (artificial intelligence); multilayer perceptrons; neural net architecture; performance evaluation; speech recognition; vocabulary; Wall Street Journal database; continuous speech recognition systems; hidden Markov model; hybrid HMM-MLP recognizer; incremental speaker-adaptation technique; input vectors; inter-speaker variability; large vocabulary; multilayer perceptron; neural network architecture; performance; speaker-dependent accuracy; speaker-independent recognition; supervised learning; trainable linear input network; training; unsupervised learning; Degradation; Hidden Markov models; Markov processes; Multilayer perceptrons; Spatial databases; Speech recognition; System performance; System testing; Training data; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
Type :
conf
DOI :
10.1109/ICSLP.1996.607849
Filename :
607849