• DocumentCode
    730847
  • Title
    Unnormalized exponential and neural network language models
  • Author
    Sethy, Abhinav; Chen, Stanley; Arisoy, Ebru; Ramabhadran, Bhuvana
  • Author_Institution
    IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    5416
  • Lastpage
    5420
  • Abstract
    Model M, an exponential class-based language model, and neural network language models (NNLMs) have outperformed word n-gram language models over a wide range of tasks. However, these gains come at the cost of vastly increased computation when calculating word probabilities. For both models, the bulk of this computation involves evaluating the softmax function over a large word or class vocabulary to ensure that probabilities sum to 1. In this paper, we study unnormalized variants of Model M and NNLMs, whereby the softmax function is simply omitted; accordingly, model training must be modified to encourage scores to sum close to 1. We demonstrate up to a factor of 35 faster n-gram lookups with unnormalized models over their normalized counterparts, while still yielding state-of-the-art performance in WER (10.2% on the English Broadcast News rt04 set).
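    The approach summarized in the abstract (skipping the softmax at lookup time and training the model so that raw scores already sum close to 1) can be illustrated with a small self-normalization penalty on the log-normalizer. The sketch below is a hypothetical PyTorch illustration under that assumption, not the paper's exact training objective; the toy model layout and the penalty weight alpha are assumptions.

    import torch
    import torch.nn as nn

    class UnnormalizedNNLM(nn.Module):
        """Toy feed-forward n-gram NNLM. At lookup time the softmax normalizer
        is skipped and the raw output score is read directly."""

        def __init__(self, vocab_size=1000, context=3, embed_dim=32, hidden_dim=64):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.hidden = nn.Linear(context * embed_dim, hidden_dim)
            self.out = nn.Linear(hidden_dim, vocab_size)

        def scores(self, context_ids):
            # context_ids: (batch, context) word indices
            h = torch.tanh(self.hidden(self.embed(context_ids).flatten(1)))
            return self.out(h)  # unnormalized log-scores, shape (batch, vocab)

    def self_normalized_loss(scores, targets, alpha=0.1):
        # Cross-entropy plus a penalty pushing log Z = logsumexp(scores) toward 0,
        # so that exp(score) can later be used without dividing by Z (assumed
        # penalty form; the paper's exact objective may differ).
        log_z = torch.logsumexp(scores, dim=-1)        # (batch,)
        log_probs = scores - log_z.unsqueeze(-1)       # normalized during training
        ce = -log_probs.gather(1, targets.unsqueeze(1)).mean()
        return ce + alpha * (log_z ** 2).mean()

    # Usage: after training, a lookup is a single score read with no softmax
    # over the vocabulary, which is where the speed-up comes from.
    model = UnnormalizedNNLM()
    ctx = torch.randint(0, 1000, (4, 3))
    tgt = torch.randint(0, 1000, (4,))
    loss = self_normalized_loss(model.scores(ctx), tgt)
    loss.backward()
    fast_logprob = model.scores(ctx)[0, tgt[0]]  # unnormalized lookup, no softmax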
  • Keywords
    neural nets; probability; NNLM; exponential class-based language model; n-gram lookups; neural network language models; softmax function; word n-gram language models; word probabilities; Acoustics; Artificial neural networks; Training; Model M; fast lookup; unnormalized models
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Conference_Location
    South Brisbane, QLD, Australia
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7179006
  • Filename
    7179006