DocumentCode
730705
Title
Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition
Author
Xiangang Li; Xihong Wu
Author_Institution
Speech & Hearing Res. Center, Peking Univ., Beijing, China
fYear
2015
fDate
19-24 April 2015
Firstpage
4520
Lastpage
4524
Abstract
Long short-term memory (LSTM) based acoustic modeling methods have recently been shown to give state-of-the-art performance on some speech recognition tasks. To achieve further performance improvements, this work investigates deep extensions of LSTM, motivated by the observation that deep hierarchical models have proven more effective than shallow ones. Building on previous research on constructing deep recurrent neural networks (RNNs), alternative deep LSTM architectures are proposed and empirically evaluated on a large vocabulary conversational telephone speech recognition task. In addition, a training procedure for LSTM networks on multi-GPU devices is introduced and discussed. Experimental results demonstrate that the deep LSTM networks benefit from depth and yield state-of-the-art performance on this task.
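As a rough illustration of the kind of deep (stacked) LSTM acoustic model the abstract refers to, the sketch below stacks several LSTM layers and maps the top layer's per-frame outputs to HMM-state (senone) posteriors. This is a minimal PyTorch sketch under assumed placeholder sizes (feature dimension, hidden width, number of layers, number of states); it is not the specific architectures or the multi-GPU training procedure evaluated in the paper.

```python
# Minimal sketch (not the paper's exact architecture): a stacked LSTM
# acoustic model mapping acoustic feature frames to HMM-state logits.
# All dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class DeepLSTMAcousticModel(nn.Module):
    def __init__(self, feat_dim=40, hidden_dim=512, num_layers=4, num_states=4000):
        super().__init__()
        # Depth comes from stacking LSTM layers; each layer feeds the next.
        self.lstm = nn.LSTM(feat_dim, hidden_dim, num_layers=num_layers,
                            batch_first=True)
        # Output layer over tied HMM states (softmax applied inside the loss).
        self.output = nn.Linear(hidden_dim, num_states)

    def forward(self, feats):
        # feats: (batch, time, feat_dim) acoustic feature frames
        hidden_seq, _ = self.lstm(feats)
        return self.output(hidden_seq)  # per-frame state logits

# Usage: frame-level training against forced-alignment targets.
model = DeepLSTMAcousticModel()
frames = torch.randn(8, 200, 40)   # batch of 8 utterances, 200 frames each
logits = model(frames)             # shape (8, 200, 4000)
```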
Keywords
graphics processing units; learning (artificial intelligence); recurrent neural nets; speech recognition; RNN; acoustic modeling method; deep LSTM network; deep recurrent neural network; hierarchical model; large vocabulary speech recognition; long short-term memory; multi-GPU device; telephone speech recognition task; training process; Acoustics; Computer architecture; Hidden Markov models; Recurrent neural networks; Speech; Speech recognition; Training; acoustic modeling; deep neural networks; large vocabulary speech recognition; long short-term memory; recurrent neural networks
fLanguage
English
Publisher
IEEE
Conference_Title
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location
South Brisbane, QLD, Australia
Type
conf
DOI
10.1109/ICASSP.2015.7178826
Filename
7178826
Link To Document