• DocumentCode
    2175189
  • Title
    Large vocabulary continuous speech recognition with context-dependent DBN-HMMs
  • Author
    Dahl, George E. ; Yu, Dong ; Deng, Li ; Acero, Alex
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Toronto, Toronto, ON, Canada
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    4688
  • Lastpage
    4691
  • Abstract
    The context-independent deep belief network (DBN) hidden Markov model (HMM) hybrid architecture has recently achieved promising results for phone recognition. In this work, we propose a context-dependent DBN-HMM system that dramatically outperforms strong Gaussian mixture model (GMM)-HMM baselines on a challenging, large vocabulary, spontaneous speech recognition dataset from the Bing mobile voice search task. Our system achieves absolute sentence accuracy improvements of 5.8% and 9.2% over GMM-HMMs trained using the minimum phone error rate (MPE) and maximum likelihood (ML) criteria, respectively, which translate to relative error reductions of 16.0% and 23.2%.
  • Keywords
    Gaussian distribution; belief networks; hidden Markov models; speech recognition; Bing mobile voice search task; Gaussian mixture model; ML criteria; MPE; context-dependent DBN-HMMs; context-independent deep belief network; hidden Markov model hybrid architecture; large vocabulary continuous speech recognition; maximum likelihood criteria; minimum phone error rate; Accuracy; Acoustics; Artificial neural networks; Hidden Markov models; Speech recognition; Training; Vocabulary; DBN-HMM; LVCSR; Speech recognition; context-dependent phone; deep belief network
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947401
  • Filename
    5947401