Title :
Language model adaptation via minimum discrimination information
Author :
Rao, P. Srinivasa ; Monkowski, Michael D. ; Roukos, Salim
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
Statistical language models improve the performance of speech recognition systems by providing estimates of a priori probabilities of word sequences. The commonly used trigram language models estimate the conditional probability of a word given the previous two words from a large corpus of text. The text corpus is often a collection of several small, diverse segments such as newspaper articles or conversations on different topics. Knowledge of the current topic could be used to adapt the general trigram language model to match that topic more closely. For example, an interpolation of the general language model with one built on the topic data could be used. The authors first discuss the adaptation of general trigram language models to a known topic using the minimum discrimination information (MDI) method. They then present results on the Switchboard corpus, which consists of telephone conversations on several topics.
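The interpolation baseline mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MDI method; it only shows a linear interpolation of a general and a topic-specific conditional distribution. The function name `interpolate`, the weight `lam`, and the toy probabilities are all hypothetical and chosen for illustration.

```python
def interpolate(p_general, p_topic, lam):
    """Linearly mix two conditional distributions P(w | previous two words).

    p_general, p_topic: dicts mapping word -> probability for the same history.
    lam: weight on the topic model (an assumed tuning parameter in [0, 1]).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    vocab = set(p_general) | set(p_topic)
    return {
        w: lam * p_topic.get(w, 0.0) + (1.0 - lam) * p_general.get(w, 0.0)
        for w in vocab
    }


# Hypothetical toy distributions for a single two-word history.
p_general = {"rises": 0.5, "falls": 0.3, "cut": 0.2}
p_topic = {"cut": 0.6, "rises": 0.4}

p_adapted = interpolate(p_general, p_topic, lam=0.25)
```

Because both inputs sum to one and the weights sum to one, the mixture is again a valid distribution. In practice the weight would be tuned on held-out topic data.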
Keywords :
interpolation; minimisation; natural languages; probability; speech recognition; a priori probabilities; conditional probability estimate; conversations; language model adaptation; minimum discrimination information; newspaper articles; speech recognition systems; statistical language models; switchboard corpus; telephone conversations; text corpus; trigram language models; word sequences; Adaptation model; Constraint optimization; Entropy; Lagrangian functions; Natural languages; Probability; Speech recognition; Statistical distributions; Telephony; Training data
Conference_Title :
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
Conference_Location :
Detroit, MI
Print_ISBN :
0-7803-2431-5
DOI :
10.1109/ICASSP.1995.479389