DocumentCode :
2262006
Title :
Long term on-line speaker adaptation for large vocabulary dictation
Author :
Thelen, Eric
Author_Institution :
Philips GmbH Forschungslab., Aachen, Germany
Volume :
4
fYear :
1996
fDate :
3-6 Oct 1996
Firstpage :
2139
Abstract :
Online speaker adaptation is desirable for speech recognition dictation applications because it allows the system to be improved with speaker-specific data obtained from the user. Because the user of a dictation system works with the device over a long period, long-term adaptation performance is more important than adaptation speed. In contrast to speaker-dependent retraining, online speaker adaptation does not require the speaker-specific speech data to be stored, and each adaptation step does not require a large computational effort. We describe our method of performing online Bayesian speaker adaptation using partial traceback. We compare supervised with unsupervised adaptation, and speaker adaptation with speaker-dependent training on the adaptation material. In our experiments, five hours of supervised adaptation halved the error rate relative to the speaker-independent startup models. In the long-term experiments, supervised online adaptation performed similarly to speaker-dependent training on the adaptation material.
Keywords :
Bayes methods; adaptive systems; dictation; learning (artificial intelligence); office automation; speech recognition; adaptation material; adaptation speed; dictation system; error rate; large vocabulary dictation; long term adaptation performance; long term online speaker adaptation; online Bayesian speaker adaptation; online speaker adaptation; partial traceback; speaker adaptation; speaker dependent retraining; speaker dependent training; speaker independent startup models; speaker specific data; speaker specific speech data; speech recognition dictation applications; supervised adaptation; supervised online adaptation; unsupervised adaptation; Adaptation model; Bayesian methods; Equations; Error analysis; Hidden Markov models; Maximum likelihood linear regression; Probability density function; Speech recognition; Text recognition; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
Type :
conf
DOI :
10.1109/ICSLP.1996.607226
Filename :
607226