DocumentCode
730847
Title
Unnormalized exponential and neural network language models
Author
Sethy, Abhinav ; Chen, Stanley ; Arisoy, Ebru ; Ramabhadran, Bhuvana
Author_Institution
IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
fYear
2015
fDate
19-24 April 2015
Firstpage
5416
Lastpage
5420
Abstract
Model M, an exponential class-based language model, and neural network language models (NNLMs) have outperformed word n-gram language models over a wide range of tasks. However, these gains come at the cost of vastly increased computation when calculating word probabilities. For both models, the bulk of this computation involves evaluating the softmax function over a large word or class vocabulary to ensure that probabilities sum to 1. In this paper, we study unnormalized variants of Model M and NNLMs, in which the softmax function is simply omitted. Accordingly, model training must be modified to encourage scores to sum close to 1. We demonstrate n-gram lookups up to 35 times faster with unnormalized models than with their normalized counterparts, while still yielding state-of-the-art word error rate (WER) performance (10.2 on the English broadcast news rt04 set).
Keywords
neural nets; probability; NNLM; exponential class based language model; n-gram lookups; neural network language models; softmax function; word n-gram language models; word probabilities; Acoustics; Artificial neural networks; Joints; Reactive power; Training; Model M; fast lookup; unnormalized models
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7179006
Filename
7179006