Title :
Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition
Author :
Xiangang Li ; Xihong Wu
Author_Institution :
Speech & Hearing Res. Center, Peking Univ., Beijing, China
Abstract :
Long short-term memory (LSTM) based acoustic modeling methods have recently been shown to give state-of-the-art performance on some speech recognition tasks. To achieve further performance improvements, this research investigates deep extensions of LSTM, since deep hierarchical models have proven more efficient than shallow ones. Motivated by previous research on constructing deep recurrent neural networks (RNNs), alternative deep LSTM architectures are proposed and empirically evaluated on a large vocabulary conversational telephone speech recognition task. In addition, a training process for LSTM networks on multi-GPU devices is introduced and discussed. Experimental results demonstrate that the deep LSTM networks benefit from depth and yield state-of-the-art performance on this task.
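As a rough illustration of the kind of deep (stacked) LSTM acoustic model the abstract describes, the sketch below builds several LSTM layers on top of one another and maps the top layer's output to frame-level state posteriors. This is not the authors' implementation; the feature dimension, hidden size, layer count, and output-class count are illustrative assumptions only.

# Hypothetical sketch of a stacked (deep) LSTM acoustic model.
# Layer sizes, feature dimension, and senone count are assumptions
# for illustration, not values taken from the paper.
import torch
import torch.nn as nn

class DeepLSTMAcousticModel(nn.Module):
    def __init__(self, feat_dim=40, hidden_dim=512, num_layers=4, num_senones=4000):
        super().__init__()
        # Stacking LSTM layers gives the "deep" extension: each layer's
        # output sequence becomes the input sequence of the layer above.
        self.lstm = nn.LSTM(feat_dim, hidden_dim, num_layers=num_layers,
                            batch_first=True)
        # Linear projection from the top LSTM layer to senone posteriors.
        self.output = nn.Linear(hidden_dim, num_senones)

    def forward(self, feats):
        # feats: (batch, frames, feat_dim) acoustic feature sequence
        hidden_seq, _ = self.lstm(feats)
        return torch.log_softmax(self.output(hidden_seq), dim=-1)

# Example: a batch of 8 utterances, 200 frames of 40-dim features each.
model = DeepLSTMAcousticModel()
log_posteriors = model(torch.randn(8, 200, 40))
print(log_posteriors.shape)  # torch.Size([8, 200, 4000])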
Keywords :
graphics processing units; learning (artificial intelligence); recurrent neural nets; speech recognition; RNN; acoustic modeling method; deep LSTM network; deep recurrent neural network; hierarchical model; large vocabulary speech recognition; long short-term memory; multi-GPU device; telephone speech recognition task; training process; Acoustics; Computer architecture; Hidden Markov models; Recurrent neural networks; Speech; Speech recognition; Training; acoustic modeling; deep neural networks; large vocabulary speech recognition; long short-term memory; recurrent neural networks
Conference_Title :
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
South Brisbane, QLD
DOI :
10.1109/ICASSP.2015.7178826