DocumentCode :
3484665
Title :
Making Deep Belief Networks effective for large vocabulary continuous speech recognition
Author :
Sainath, Tara N. ; Kingsbury, Brian ; Ramabhadran, Bhuvana ; Fousek, Petr ; Novak, Petr ; Mohamed, Abdel-rahman
Author_Institution :
IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
fYear :
2011
fDate :
11-15 Dec. 2011
Firstpage :
30
Lastpage :
35
Abstract :
To date, there has been limited work in applying Deep Belief Networks (DBNs) for acoustic modeling in LVCSR tasks, with past work using standard speech features. However, a typical LVCSR system makes use of both feature and model-space speaker adaptation and discriminative training. This paper explores the performance of DBNs in a state-of-the-art LVCSR system, showing improvements over Multi-Layer Perceptrons (MLPs) and GMM/HMMs across a variety of features on an English Broadcast News task. In addition, we provide a recipe for data parallelization of DBN training, showing that data parallelization can provide linear speed-up in the number of machines, without impacting WER.
Keywords :
belief networks; multilayer perceptrons; speech recognition; English broadcast news task; LVCSR system; acoustic modeling; data parallelization; deep belief networks; discriminative training; large vocabulary continuous speech recognition; model-space speaker adaptation; Artificial neural networks; Computers; Hidden Markov models; Mathematical model; Training; Training data
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Conference_Location :
Waikoloa, HI
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
Type :
conf
DOI :
10.1109/ASRU.2011.6163900
Filename :
6163900