DocumentCode :
82061
Title :
From Feedforward to Recurrent LSTM Neural Networks for Language Modeling
Author :
Sundermeyer, Martin ; Ney, Hermann ; Schlüter, Ralf
Author_Institution :
Comput. Sci. Dept., RWTH Aachen Univ., Aachen, Germany
Volume :
23
Issue :
3
fYear :
2015
fDate :
March 2015
Firstpage :
517
Lastpage :
529
Abstract :
Language models have traditionally been estimated based on relative frequencies, using count statistics that can be extracted from huge amounts of text data. More recently, it has been found that neural networks are particularly powerful at estimating probability distributions over word sequences, giving substantial improvements over state-of-the-art count models. However, the performance of neural network language models strongly depends on their architectural structure. This paper compares count models to feedforward, recurrent, and long short-term memory (LSTM) neural network variants on two large-vocabulary speech recognition tasks. We evaluate the models in terms of perplexity and word error rate, experimentally validating the strong correlation between the two quantities, which we find to hold regardless of the underlying type of language model. Furthermore, neural networks incur an increased computational complexity compared to count models, and they model context dependencies differently, often exceeding the number of words taken into account by count-based approaches. These differences require efficient search methods for neural networks, and we analyze the potential improvements that can be obtained when applying advanced algorithms to the rescoring of word lattices in large-scale setups.
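For context, the two evaluation measures named in the abstract follow their standard definitions; the formulas below are a brief reminder and are not taken from the paper itself. Perplexity measures how well a model p predicts a held-out word sequence w_1, ..., w_N, and word error rate counts the edit operations needed to turn the recognizer output into the reference transcript:

\mathrm{PPL} = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p\!\left(w_i \mid w_1^{\,i-1}\right)\right),
\qquad
\mathrm{WER} = \frac{S + D + I}{N_{\mathrm{ref}}},

where S, D, and I are the substitution, deletion, and insertion counts from the minimum-edit-distance alignment against the reference, and N_ref is the number of reference words. Lower values of both indicate better models; the abstract's finding is that the two quantities are strongly correlated regardless of the language model type.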
Keywords :
error statistics; feedforward neural nets; linguistics; natural language processing; recurrent neural nets; speech recognition; vocabulary; computational complexity; context dependences; count models; feedforward LSTM neural networks; language modeling; large-vocabulary speech recognition tasks; long short-term memory neural network; perplexity; quantities correlation; recurrent LSTM neural networks; word error rate; word lattices; Context; Feedforward neural networks; Lattices; Recurrent neural networks; Speech; Training; Feedforward neural network; Kneser-Ney smoothing; language modeling; long short-term memory (LSTM); recurrent neural network (RNN);
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
2329-9290
Type :
jour
DOI :
10.1109/TASLP.2015.2400218
Filename :
7050391