DocumentCode :
1051713
Title :
Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model
Author :
Bengio, Yoshua ; Senécal, Jean-Sébastien
Author_Institution :
Univ. de Montréal, Montréal
Volume :
19
Issue :
4
fYear :
2008
fDate :
4/1/2008
Firstpage :
713
Lastpage :
722
Abstract :
Previous work on statistical language modeling has shown that it is possible to train a feedforward neural network to approximate probabilities over sequences of words, resulting in significant error reduction when compared to standard baseline models based on n-grams. However, training the neural network model with the maximum-likelihood criterion requires computations proportional to the number of words in the vocabulary. In this paper, we introduce adaptive importance sampling as a way to accelerate training of the model. The idea is to use an adaptive n-gram model to track the conditional distributions produced by the neural network. We show that a very significant speedup can be obtained on standard problems.
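The speedup described in the abstract comes from replacing the full softmax expectation over the vocabulary with a sampled estimate drawn from a cheap proposal distribution. A minimal sketch of that idea, using self-normalized importance sampling with a smoothed proposal standing in for the adaptive n-gram model (all names and sizes below are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative sizes, not from the paper).
V = 1000                       # vocabulary size
scores = rng.normal(size=V)    # network scores s(w, h) for one context h
g = rng.normal(size=V)         # stand-in for the per-word gradient terms

# Exact softmax distribution P(w|h) -- the O(V) computation that
# maximum-likelihood training must perform at every step.
p = np.exp(scores - scores.max())
p /= p.sum()
exact = p @ g                  # E_P[g(w)], the expensive expectation

# Proposal Q(w|h): here a smoothed copy of P, standing in for the
# adaptive n-gram model that tracks the network's distribution.
q = 0.5 * p + 0.5 / V

# Self-normalized importance sampling: draw k << V words from Q and
# reweight each by exp(s(w,h)) / Q(w|h), then normalize the weights.
k = 1000
idx = rng.choice(V, size=k, p=q)
w = np.exp(scores[idx] - scores.max()) / q[idx]
w /= w.sum()
estimate = w @ g               # cheap estimate of E_P[g(w)]
```

The closer the proposal tracks the network's conditional distribution, the lower the variance of the estimate, which is why the paper adapts the n-gram proposal during training rather than fixing it.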
Keywords :
computational linguistics; feedforward neural nets; importance sampling; learning (artificial intelligence); maximum likelihood estimation; natural language processing; adaptive importance sampling; adaptive n-gram model; feedforward neural network; maximum-likelihood criterion; neural network model training; neural probabilistic language model; vocabulary; word sequences; Energy-based models; Monte Carlo methods; fast training; importance sampling; language modeling; probabilistic neural networks; Computer Simulation; Humans; Language; Markov Chains; Models, Statistical; Neural Networks (Computer); Programming Languages;
fLanguage :
English
Journal_Title :
IEEE Transactions on Neural Networks
Publisher :
IEEE
ISSN :
1045-9227
Type :
jour
DOI :
10.1109/TNN.2007.912312
Filename :
4443871