Regional Information Center for Science and Technology - Improving language models by using distant information

DocumentCode :

1988260

Title :

Improving language models by using distant information

Author :

Brun, A. ; Langlois, D. ; Smaili, K.

Author_Institution :

Univ. Nancy 2, Nancy

fYear :

2007

fDate :

12-15 Feb. 2007

Firstpage :

Lastpage :

Abstract :

This study examines an original way to take advantage of distant information in statistical language models. We show that it is possible to use n-gram models with histories different from those used during training. These models are called crossing context models. Our study deals with classical and distant n-gram models. A mixture of four models is proposed and evaluated. A bigram linear mixture achieves a 14% improvement in perplexity. Moreover, the trigram mixture outperforms the standard trigram by 5.6%. These improvements were obtained without increasing the complexity of standard n-gram models. The resulting mixture language model has been integrated into a speech recognition system. Its evaluation shows a slight improvement in word error rate on the data used for the francophone evaluation campaign ESTER [1]. Finally, the impact of the proposed crossing context language models on performance is presented for various speakers.
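
As an illustration of the general idea of a linear mixture of a classical and a distant bigram, the following is a minimal sketch only. The abstract does not specify the four component models, their weights, or the smoothing used, so the add-one smoothing, the distance of 2, the weights, the toy corpus, and every function name here are assumptions made for the example. It does not implement the crossing context models themselves.

```python
import math
from collections import Counter


def train_bigram(tokens, distance=1):
    """Count pairs (tokens[i - distance], tokens[i]).

    distance=1 gives a classical bigram; distance=2 gives a distant
    bigram whose history skips one word.
    """
    pair_counts = Counter()
    context_counts = Counter()
    for i in range(distance, len(tokens)):
        pair_counts[(tokens[i - distance], tokens[i])] += 1
        context_counts[tokens[i - distance]] += 1
    return pair_counts, context_counts


def prob(model, context, word, vocab_size):
    """Add-one smoothed P(word | context); the smoothing is an assumption."""
    pair_counts, context_counts = model
    return (pair_counts[(context, word)] + 1) / (context_counts[context] + vocab_size)


def mixture_prob(models, weights, tokens, i, vocab_size):
    """Linear mixture: weighted sum of the component probabilities."""
    return sum(
        weight * prob(model, tokens[i - distance], tokens[i], vocab_size)
        for (distance, model), weight in zip(models, weights)
    )


def perplexity(models, weights, tokens, vocab_size, start=2):
    """Perplexity of the mixture over tokens[start:]."""
    log_sum = 0.0
    count = 0
    for i in range(start, len(tokens)):
        log_sum += math.log(mixture_prob(models, weights, tokens, i, vocab_size))
        count += 1
    return math.exp(-log_sum / count)


# Toy data, purely illustrative.
train = "the cat sat on the mat the dog sat on the rug".split()
test = "the cat sat on the rug".split()
vocab_size = len(set(train))

# Each entry pairs a history distance with the model trained at that distance.
models = [(1, train_bigram(train, 1)), (2, train_bigram(train, 2))]

ppl_bigram = perplexity(models, [1.0, 0.0], test, vocab_size)   # classical bigram only
ppl_mixture = perplexity(models, [0.7, 0.3], test, vocab_size)  # hypothetical weights
print(ppl_bigram, ppl_mixture)
```

In practice the mixture weights would be estimated on held-out data rather than fixed by hand, and the weights must sum to 1 so that the mixture remains a probability distribution.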

Keywords :

speech recognition; statistical analysis; bigram linear mixture; context language models; distant information; distant n-gram models; francophone evaluation; speech recognition system; trigram mixture; word error rate; Context modeling; Error analysis; History; Natural languages; Neodymium; Speech recognition; Testing;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Signal Processing and Its Applications, 2007. ISSPA 2007. 9th International Symposium on

Conference_Location :

Sharjah

Print_ISBN :

978-1-4244-0778-1

Electronic_ISBN :

978-1-4244-1779-8

Type :

conf

DOI :

10.1109/ISSPA.2007.4555480

Filename :

4555480

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1988260