• DocumentCode
    730677
  • Title
    Far-field speech recognition using CNN-DNN-HMM with convolution in time
  • Author
    Yoshioka, Takuya ; Karita, Shigeki ; Nakatani, Tomohiro
  • Author_Institution
    NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    4360
  • Lastpage
    4364
  • Abstract
    Recent studies in speech recognition have shown that the performance of convolutional neural networks (CNNs) is superior to that of fully connected deep neural networks (DNNs). In this paper, we explore the use of CNNs in far-field speech recognition for dealing with reverberation, which blurs spectral energies along the time axis. Unlike most previous CNN applications to speech recognition, we consider convolution in time to examine whether it provides an improved reverberation modelling capability. Experimental results show that a CNN coupled with a fully connected DNN can model short time correlations in feature vectors with fewer parameters than a DNN and thus generalise better to unseen test environments. Combining this approach with signal-space dereverberation, which copes with long-term correlations, is shown to result in further improvement, where the gains from both approaches are almost additive. An initial investigation of the use of restricted convolution forms is also undertaken.
  • Keywords
    convolution; hidden Markov models; neural nets; speech recognition; CNN-DNN-HMM; convolutional neural networks; deep neural networks; far-field speech recognition; feature vectors; hidden Markov model; short time correlations; signal-space dereverberation; spectral energy; time axis; time convolution; neural networks; reverberation; speech; convolutional neural network; deep neural network
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178794
  • Filename
    7178794