DocumentCode
730782
Title
Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning
Author
Giri, Ritwik; Seltzer, Michael L.; Droppo, Jasha; Yu, Dong
Author_Institution
Univ. of California, San Diego, La Jolla, CA, USA
fYear
2015
fDate
19-24 April 2015
Firstpage
5014
Lastpage
5018
Abstract
In this paper, we propose two approaches to improve deep neural network (DNN) acoustic models for speech recognition in reverberant environments. Both methods utilize auxiliary information in training the DNN but differ in the type of information and the manner in which it is used. The first method uses parallel training data for multi-task learning, in which the network is trained to perform both a primary senone classification task and a secondary feature enhancement task using a shared representation. The second method uses a parameterization of the reverberant environment extracted from the observed signal to train a room-aware DNN. Experiments were performed on the single microphone task of the REVERB Challenge corpus. The proposed approach obtained a word error rate of 7.8% on the SimData test set, which is lower than all reported systems using the same training data and evaluation conditions, and 27.5% on the mismatched RealData test set, which is lower than all but two systems.
Keywords
acoustic noise; learning (artificial intelligence); neural nets; reverberation; signal classification; speech processing; speech recognition; DNN acoustic models; DNN training; REVERB Challenge corpus; SimData test set; auxiliary information; mismatched RealData test set; multitask learning; parallel training data; primary senone classification task; reverberant environments; room aware DNN; room aware deep neural network; secondary feature enhancement task; shared representation; single microphone task; Microphones; Speech; Training; Training data; Multi-task learning; deep neural network; room impulse response
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7178925
Filename
7178925