Title :
Language model adaptation for conversational speech recognition using automatically tagged pseudo-morphological classes
Author :
Crespo, C. ; Tapias, D. ; Escalada, G. ; Álvarez, J.
Author_Institution :
Speech Technol. Group, Telefonica Investigacion y Desarrollo, Madrid, Spain
Abstract :
Statistical language models provide a powerful tool for modelling natural spoken language. Nevertheless, a large set of training sentences is required to estimate the model parameters reliably. The authors present a method for estimating n-gram probabilities from sparse data. The proposed language modelling strategy allows one to adapt a generic language model (LM) to a new semantic domain with just a few hundred sentences. This reduced set of sentences is automatically tagged with eighty different pseudo-morphological labels, and then a word-bigram LM is derived from them. Finally, this target-domain word-bigram LM is interpolated with a generic back-off word-bigram LM, which was estimated using a large text database. This strategy reduces the word error rate of the SPATIS (SPanish ATIS) task by 27%.
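The interpolation step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the interpolation formula, so this assumes simple linear interpolation with a fixed weight `lam`. The function names, the toy sentences, and the weight value are all hypothetical. For brevity the generic model here uses plain maximum-likelihood estimates rather than the back-off smoothing mentioned in the abstract, and the pseudo-morphological tagging step is omitted.

```python
from collections import Counter


def train_bigram(sentences):
    """Maximum-likelihood bigram estimates P(word | prev) from tokenized sentences."""
    pair_counts = Counter()
    context_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            pair_counts[(prev, word)] += 1
            context_counts[prev] += 1
    return {pair: count / context_counts[pair[0]]
            for pair, count in pair_counts.items()}


def interpolate(p_domain, p_generic, lam):
    """Return P(word | prev) = lam * P_domain + (1 - lam) * P_generic.

    Assumption: linear interpolation with a fixed weight; the source
    abstract does not state which interpolation scheme was used.
    """
    def prob(prev, word):
        return (lam * p_domain.get((prev, word), 0.0)
                + (1.0 - lam) * p_generic.get((prev, word), 0.0))
    return prob


# Hypothetical toy corpora, not data from the paper.
domain_sentences = [
    ["show", "flights", "to", "madrid"],
    ["show", "flights", "to", "munich"],
]
generic_sentences = [
    ["show", "me", "the", "news"],
    ["flights", "to", "madrid", "are", "late"],
]

p_domain = train_bigram(domain_sentences)
p_generic = train_bigram(generic_sentences)
lm = interpolate(p_domain, p_generic, lam=0.5)

print(lm("to", "madrid"))
print(lm("show", "flights"))
```

In this sketch a bigram seen only in the small domain set, such as ("show", "flights"), still receives probability mass, while bigrams seen in both corpora combine evidence from each. In practice the weight would be tuned on held-out domain data.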
Keywords :
interpolation; modelling; parameter estimation; probability; speech recognition; SPATIS task; automatically tagged pseudo-morphological classes; conversational speech recognition; generic back-off word-bigram language model; generic language model; language model adaptation; large text database; n-gram probability estimation; natural spoken language model; pseudo-morphological labels; sparse data; statistical language models; training sentences; word error rate; word-bigram language model; Adaptation model; Databases; Error analysis; Maximum likelihood estimation; Natural languages; Smoothing methods; Speech synthesis
Conference_Title :
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location :
Munich
Print_ISBN :
0-8186-7919-0
DOI :
10.1109/ICASSP.1997.596058