• DocumentCode
    672371
  • Title
    Elastic spectral distortion for low resource speech recognition with deep neural networks
  • Author
    Kanda, Natsuki ; Takeda, Ryu ; Obuchi, Yasunari
  • Author_Institution
    Central Res. Lab., Hitachi Ltd., Kokubunji, Japan
  • fYear
    2013
  • fDate
    8-12 Dec. 2013
  • Firstpage
    309
  • Lastpage
    314
  • Abstract
    Acoustic models based on hidden Markov models with deep neural networks (DNN-HMMs) have recently been proposed and have achieved high recognition accuracy. In this paper, we investigated an elastic spectral distortion method that artificially augments training samples so that DNN-HMMs acquire sufficient robustness even when only a limited number of training samples is available. We investigated three distortion methods - vocal tract length distortion, speech rate distortion, and frequency-axis random distortion - and evaluated them on Japanese lecture recordings. In a large vocabulary continuous speech recognition task with only 10 hours of training samples, a DNN-HMM trained with the elastic spectral distortion method achieved a 10.1% relative word error reduction compared with a normally trained DNN-HMM.
  • Keywords
    acoustic signal processing; hidden Markov models; neural nets; spectral analysis; speech recognition; Japanese lecture recordings; acoustic model; artificially training sample augmentation; deep neural networks; elastic spectral distortion method; frequency-axis random distortion; low resource speech recognition; normally trained DNN-HMM; speech rate distortion; vocabulary continuous speech recognition task; vocal tract length distortion; Accuracy; Acoustic distortion; Acoustics; Speech; Training; Deep neural network; elastic distortion
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
  • Conference_Location
    Olomouc
  • Type
    conf
  • DOI
    10.1109/ASRU.2013.6707748
  • Filename
    6707748