DocumentCode :
1690242
Title :
Recurrent neural networks for voice activity detection
Author :
Hughes, Tim ; Mierle, Keir
Author_Institution :
Google, Inc., Mountain View, CA, USA
fYear :
2013
Firstpage :
7378
Lastpage :
7382
Abstract :
We present a novel recurrent neural network (RNN) model for voice activity detection. Our multi-layer RNN model, in which nodes compute quadratic polynomials, outperforms a much larger baseline system composed of Gaussian mixture models (GMMs) and a hand-tuned state machine (SM) for temporal smoothing. All parameters of our RNN model are optimized together, so that it properly weights its preference for temporal continuity against the acoustic features in each frame. Our RNN uses one tenth as many parameters as the GMM+SM baseline system and outperforms it with a 26% reduction in false alarms, reducing overall speech recognition computation time by 17% while reducing word error rate by 1% relative.
Keywords :
Gaussian processes; polynomials; recurrent neural nets; speech recognition; GMM; Gaussian mixture models; multilayer RNN model; quadratic polynomials; recurrent neural networks; voice activity detection; Computational modeling; Computer architecture; Delay lines; Hidden Markov models; Recurrent neural networks; Speech; Training; Voice activity detection (VAD); endpointing; recurrent neural networks (RNNs)
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6639096
Filename :
6639096