Title :
Scaling shrinkage-based language models
Author :
Chen, Stanley F. ; Mangu, Lidia ; Ramabhadran, Bhuvana ; Sarikaya, Ruhi ; Sethy, Abhinav
Author_Institution :
IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
Date :
Dec. 13 2009 - Dec. 17 2009
Abstract :
In previous work, we showed that a novel class-based language model, Model M, and the method of regularized minimum discrimination information (rMDI) modeling outperform comparable methods on moderate amounts of Wall Street Journal data. Both of these methods are motivated by the observation that shrinking the sum of parameter magnitudes in an exponential language model tends to improve performance. In this paper, we investigate whether these shrinkage-based techniques also perform well on larger training sets and in other domains. First, we explain why good performance on large data sets is uncertain, by showing that gains relative to a baseline n-gram model tend to decrease as training set size increases. Next, we evaluate several methods for data/model combination with Model M and rMDI models on limited-scale domains, to determine which techniques should work best on large domains. Finally, we apply these methods to a variety of medium-to-large-scale domains covering several languages, and show that Model M consistently provides significant gains over existing language models for state-of-the-art systems in both speech recognition and machine translation.
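Note: the "shrinkage" referred to above concerns penalizing the magnitudes of the feature weights of an exponential (log-linear) n-gram model during estimation. As an illustrative sketch only (the symbols lambda_i, f_i, alpha, sigma^2, and Z_lambda are assumed notation, not necessarily the paper's exact formulation), a regularized training objective of this kind can be written as

\[
\mathcal{O}(\lambda) \;=\; \sum_{(x,y)\in\mathrm{train}} \log p_\lambda(y \mid x)
\;-\; \alpha \sum_i |\lambda_i|
\;-\; \frac{1}{2\sigma^2} \sum_i \lambda_i^2,
\qquad
p_\lambda(y \mid x) \;=\; \frac{\exp\!\bigl(\sum_i \lambda_i f_i(x,y)\bigr)}{Z_\lambda(x)},
\]

where the \(\ell_1\) term \(\alpha \sum_i |\lambda_i|\) is what directly shrinks the sum of parameter magnitudes mentioned in the abstract.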
Keywords :
language translation; natural language processing; speech recognition; language model; machine translation; regularized minimum discrimination information models; shrinkage-based techniques; Acoustic testing; Automatic speech recognition; Interpolation; Large-scale systems; Natural languages; Performance gain; Predictive models; Training data
Conference_Title :
2009 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU 2009)
Conference_Location :
Merano, Italy
Print_ISBN :
978-1-4244-5478-5
Electronic_ISBN :
978-1-4244-5479-2
DOI :
10.1109/ASRU.2009.5373380