• DocumentCode
    730782
  • Title
    Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning
  • Author
    Giri, Ritwik; Seltzer, Michael L.; Droppo, Jasha; Yu, Dong
  • Author_Institution
    Univ. of California, San Diego, La Jolla, CA, USA
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    5014
  • Lastpage
    5018
  • Abstract
    In this paper, we propose two approaches to improve deep neural network (DNN) acoustic models for speech recognition in reverberant environments. Both methods utilize auxiliary information in training the DNN but differ in the type of information and the manner in which it is used. The first method uses parallel training data for multi-task learning, in which the network is trained to perform both a primary senone classification task and a secondary feature enhancement task using a shared representation. The second method uses a parameterization of the reverberant environment extracted from the observed signal to train a room-aware DNN. Experiments were performed on the single microphone task of the REVERB Challenge corpus. The proposed approach obtained a word error rate of 7.8% on the SimData test set, which is lower than all reported systems using the same training data and evaluation conditions, and 27.5% on the mismatched RealData test set, which is lower than all but two systems.
  • Keywords
    acoustic noise; learning (artificial intelligence); neural nets; reverberation; signal classification; speech processing; speech recognition; DNN acoustic models; DNN training; REVERB Challenge corpus; SimData test set; auxiliary information; mismatched RealData test set; multitask learning; parallel training data; primary senone classification task; reverberant environments; room aware DNN; room aware deep neural network; secondary feature enhancement task; shared representation; single microphone task; Microphones; Speech; Training; Training data; Multi-task learning; deep neural network; room impulse response
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178925
  • Filename
    7178925
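
The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the single shared ReLU layer, the linear enhancement head, the mean-squared-error auxiliary loss, the weight `aux_weight`, the choice to append the room descriptor to every input frame, and all function and parameter names are assumptions made for illustration only. The record does not state how the room parameterization is computed or how it is fed to the network.

```python
import numpy as np

def append_room_descriptor(frames, room_vec):
    """Room-aware input (assumed form): tile one per-utterance room
    descriptor onto every feature frame."""
    tiled = np.tile(room_vec, (frames.shape[0], 1))
    return np.concatenate([frames, tiled], axis=1)

def init_params(in_dim, hidden_dim, n_senones, clean_dim, seed=0):
    """Random weights for a shared layer plus two task-specific heads."""
    rng = np.random.default_rng(seed)
    return {
        "W_shared": rng.normal(0.0, 0.1, (in_dim, hidden_dim)),
        "b_shared": np.zeros(hidden_dim),
        "W_senone": rng.normal(0.0, 0.1, (hidden_dim, n_senones)),
        "b_senone": np.zeros(n_senones),
        "W_enh": rng.normal(0.0, 0.1, (hidden_dim, clean_dim)),
        "b_enh": np.zeros(clean_dim),
    }

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def multitask_loss(x_in, senone_ids, x_clean, params, aux_weight=0.1):
    """Primary senone cross-entropy plus a weighted enhancement loss,
    both computed from the same shared hidden representation."""
    # Shared representation used by both tasks.
    h = np.maximum(0.0, x_in @ params["W_shared"] + params["b_shared"])

    # Primary task: senone classification.
    probs = softmax(h @ params["W_senone"] + params["b_senone"])
    rows = np.arange(x_in.shape[0])
    ce = -np.mean(np.log(probs[rows, senone_ids] + 1e-12))

    # Secondary task: predict clean features from the parallel data.
    enhanced = h @ params["W_enh"] + params["b_enh"]
    mse = np.mean((enhanced - x_clean) ** 2)

    return ce + aux_weight * mse, ce, mse
```

In a setup like this, the enhancement head would only be needed during training; at recognition time only the senone output is used, so the auxiliary task adds no decoding cost.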