• DocumentCode
    1051713
  • Title
    Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model
  • Author
    Bengio, Yoshua; Senécal, Jean-Sébastien
  • Author_Institution
    Univ. de Montreal, Montreal
  • Volume
    19
  • Issue
    4
  • fYear
    2008
  • fDate
    4/1/2008
  • Firstpage
    713
  • Lastpage
    722
  • Abstract
    Previous work on statistical language modeling has shown that it is possible to train a feedforward neural network to approximate probabilities over sequences of words, resulting in significant error reduction when compared to standard baseline models based on n-grams. However, training the neural network model with the maximum-likelihood criterion requires computations proportional to the number of words in the vocabulary. In this paper, we introduce adaptive importance sampling as a way to accelerate training of the model. The idea is to use an adaptive n-gram model to track the conditional distributions produced by the neural network. We show that a very significant speedup can be obtained on standard problems.
  • Keywords
    computational linguistics; feedforward neural nets; importance sampling; learning (artificial intelligence); maximum likelihood estimation; natural language processing; adaptive importance sampling; adaptive n-gram model; feedforward neural network; maximum-likelihood criterion; neural network model training; neural probabilistic language model; vocabulary; words sequences; Energy-based models; Monte Carlo methods; fast training; language modeling; probabilistic neural networks; Computer Simulation; Humans; Language; Markov Chains; Models, Statistical; Neural Networks (Computer); Programming Languages
  • fLanguage
    English
  • Journal_Title
    Neural Networks, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9227
  • Type
    jour
  • DOI
    10.1109/TNN.2007.912312
  • Filename
    4443871