Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning

Author

Giri, Ritwik ; Seltzer, Michael L. ; Droppo, Jasha ; Yu, Dong

Author_Institution

Univ. of California, San Diego, La Jolla, CA, USA

fYear

2015

fDate

19-24 April 2015

Firstpage

5014

Lastpage

5018

Abstract

In this paper, we propose two approaches to improve deep neural network (DNN) acoustic models for speech recognition in reverberant environments. Both methods utilize auxiliary information in training the DNN but differ in the type of information and the manner in which it is used. The first method uses parallel training data for multi-task learning, in which the network is trained to perform both a primary senone classification task and a secondary feature enhancement task using a shared representation. The second method uses a parameterization of the reverberant environment extracted from the observed signal to train a room-aware DNN. Experiments were performed on the single microphone task of the REVERB Challenge corpus. The proposed approach obtained a word error rate of 7.8% on the SimData test set, which is lower than all reported systems using the same training data and evaluation conditions, and 27.5% on the mismatched RealData test set, which is lower than all but two systems.
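The two ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the room parameterization is simply appended to each input feature frame, that the primary task is scored with softmax cross-entropy and the secondary task with mean squared error against the parallel clean features, and that the two losses are combined with a fixed weight. The function names and the `enh_weight` parameter are hypothetical. The network itself is omitted: `logits` and `predicted_clean` stand for the outputs of two task-specific layers on top of the shared hidden layers.

```python
import math

def room_aware_input(frame, room_params):
    """Append a room descriptor to one acoustic feature frame (assumed scheme)."""
    return list(frame) + list(room_params)

def senone_cross_entropy(logits, senone_id):
    """Primary task: cross-entropy of a softmax over senone logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[senone_id]

def enhancement_mse(predicted_clean, clean_frame):
    """Secondary task: mean squared error against the parallel clean frame."""
    squared = [(p - c) ** 2 for p, c in zip(predicted_clean, clean_frame)]
    return sum(squared) / len(clean_frame)

def multitask_loss(logits, senone_id, predicted_clean, clean_frame, enh_weight=0.1):
    """Weighted sum of both task losses; enh_weight is a hypothetical setting."""
    ce = senone_cross_entropy(logits, senone_id)
    mse = enhancement_mse(predicted_clean, clean_frame)
    return ce + enh_weight * mse

# Toy usage with made-up numbers: a 3-dimensional frame, a 1-dimensional room descriptor
x = room_aware_input([0.2, -1.0, 0.5], [0.6])
loss = multitask_loss([2.0, 0.5, -1.0], 0, [0.1, 0.0], [0.0, 0.0])
```

Because both losses are computed from the same hidden representation, gradients from the enhancement term also shape the layers used for senone classification. At recognition time only the senone output would be needed.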

Keywords

acoustic noise; learning (artificial intelligence); neural nets; reverberation; signal classification; speech processing; speech recognition; DNN acoustic models; DNN training; REVERB Challenge corpus; SimData test set; auxiliary information; mismatched RealData test set; multitask learning; parallel training data; primary senone classification task; reverberant environments; room aware DNN; room aware deep neural network; secondary feature enhancement task; shared representation; single microphone task; Microphones; Speech; Training; Training data; Multi-task learning; deep neural network; room impulse response

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Conference_Location

South Brisbane, QLD

Type

conf

DOI

10.1109/ICASSP.2015.7178925

Filename

7178925