Title :
Static interpolation of exponential n-gram models using features of features
Author :
Sethy, Abhinav ; Chen, S. ; Ramabhadran, Bhuvana ; Vozila, Paul
Author_Institution :
IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
The best language model performance for a task is often achieved by interpolating language models built separately on corpora from multiple sources. While common practice is to use a single set of fixed interpolation weights to combine models, past work has found that gains can be had by allowing weights to vary by n-gram, when linearly interpolating word n-gram models. In this work, we investigate whether similar ideas can be used to improve log-linear interpolation for Model M, an exponential class-based n-gram model with state-of-the-art performance. We focus on log-linear interpolation as Model M's combined via (regular) linear interpolation cannot be statically compiled into a single model, as is required for many applications due to resource constraints. We present a general parameter interpolation framework in which a weight prediction model is used to compute the interpolation weights for each n-gram. The weight prediction model takes a rich representation of n-gram features as input, and is trained to optimize the perplexity of a held-out set. In experiments on Broadcast News, we show that a mixture of experts weight prediction model yields significant perplexity and word-error rate improvements as compared to static linear interpolation.
Keywords :
interpolation; log normal distribution; natural language processing; exponential n-gram models; language models; log linear interpolation; multiple sources; static interpolation; weight prediction model; Adaptation models; Computational modeling; Data models; History; Interpolation; Predictive models; Training
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854529