  • DocumentCode
    179572
  • Title
    Impact of single-microphone dereverberation on DNN-based meeting transcription systems
  • Author
    Yoshioka, Takuya ; Xie Chen ; Gales, Mark J.F.
  • Author_Institution
    Eng. Dept., Cambridge Univ., Cambridge, UK
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    5527
  • Lastpage
    5531
  • Abstract
    Over the past few decades, a range of front-end techniques have been proposed to improve the robustness of automatic speech recognition systems against environmental distortion. While these techniques are effective for small tasks consisting of carefully designed data sets, especially when used with a classical acoustic model, there has been limited evidence that they are useful for a state-of-the-art system with large-scale realistic data. This paper focuses on reverberation as a type of distortion and investigates the degree to which dereverberation processing can improve the performance of various forms of acoustic models based on deep neural networks (DNNs) in a challenging meeting transcription task using a single distant microphone. Experimental results show that dereverberation improves the recognition performance regardless of the acoustic model structure and the type of feature vectors input into the neural networks, providing additional relative improvements of 4.7% and 4.1% to our best configured speaker-independent and speaker-adaptive DNN-based systems, respectively.
  • Keywords
    acoustic signal processing; microphones; neural nets; reverberation; speaker recognition; DNN-based meeting transcription systems; acoustic model structure; automatic speech recognition systems; classical acoustic model; deep neural networks; dereverberation processing; environmental distortion; feature vectors; front-end techniques; recognition performance; single distant microphone; single-microphone dereverberation; speaker-adaptive DNN-based systems; speaker-independent DNN-based systems; Hidden Markov models; Reverberation; Silicon; Speech; Speech processing; Vectors; Environmental robustness; deep neural network; meeting transcription; reverberation; single distant microphone;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854660
  • Filename
    6854660