DocumentCode
1281519
Title
Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups
Author
Hinton, Geoffrey ; Deng, Li ; Yu, Dong ; Dahl, George E. ; Mohamed, Abdel-rahman ; Jaitly, Navdeep ; Senior, Andrew ; Vanhoucke, Vincent ; Nguyen, Patrick ; Sainath, Tara N. ; Kingsbury, Brian
Volume
29
Issue
6
fYear
2012
Firstpage
82
Lastpage
97
Abstract
Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models (GMMs) to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input. An alternative way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition benchmarks, sometimes by a large margin. This article provides an overview of this progress and represents the shared views of four research groups that have had recent successes in using DNNs for acoustic modeling in speech recognition.
Keywords
Gaussian processes; feedforward neural nets; hidden Markov models; speech recognition; Gaussian mixture models; HMM states; acoustic modeling; deep neural networks; feed-forward neural network; posterior probabilities; temporal variability; Acoustics; Automatic speech recognition; Data models; Neural networks; Training
fLanguage
English
Journal_Title
Signal Processing Magazine, IEEE
Publisher
IEEE
ISSN
1053-5888
Type
jour
DOI
10.1109/MSP.2012.2205597
Filename
6296526