DocumentCode :
1988260
Title :
Improving language models by using distant information
Author :
Brun, A. ; Langlois, D. ; Smaili, K.
Author_Institution :
Univ. Nancy 2, Nancy
fYear :
2007
fDate :
12-15 Feb. 2007
Firstpage :
1
Lastpage :
4
Abstract :
This study examines an original way to take advantage of distant information in statistical language models. We show that it is possible to use n-gram models with histories different from those used during training. These models are called crossing context models. Our study deals with classical and distant n-gram models. A mixture of four models is proposed and evaluated. A bigram linear mixture achieves an improvement of 14% in terms of perplexity. Moreover, the trigram mixture outperforms the standard trigram by 5.6%. These improvements have been obtained without increasing the complexity of standard n-gram models. The resulting mixture language model has been integrated into a speech recognition system. In evaluation, it achieves a slight improvement in word error rate on the data used for the francophone evaluation campaign ESTER [1]. Finally, the impact of the proposed crossing context language models on performance is presented for various speakers.
Keywords :
speech recognition; statistical analysis; bigram linear mixture; context language models; distant information; distant n-gram models; francophone evaluation; speech recognition system; trigram mixture; word error rate; Context modeling; Error analysis; History; Natural languages; Neodymium; Speech recognition; Testing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Signal Processing and Its Applications, 2007. ISSPA 2007. 9th International Symposium on
Conference_Location :
Sharjah
Print_ISBN :
978-1-4244-0778-1
Electronic_ISBN :
978-1-4244-1779-8
Type :
conf
DOI :
10.1109/ISSPA.2007.4555480
Filename :
4555480