• DocumentCode
    3744898
  • Title
    Acoustic modeling with neural graph embeddings
  • Author
    Yuzong Liu; Katrin Kirchhoff
  • Author_Institution
    Department of Electrical Engineering, University of Washington, Seattle, WA 98195
  • fYear
    2015
  • Firstpage
    581
  • Lastpage
    588
  • Abstract
    Graph-based learning (GBL) is a form of semi-supervised learning that has been successfully exploited in acoustic modeling in the past. It utilizes manifold information in speech data that is represented as a joint similarity graph over training and test samples. Typically, GBL is used at the output level of an acoustic classifier; however, this setup is difficult to scale to large data sets, and the graph-based learner is not optimized jointly with other components of the speech recognition system. In this paper we explore a different approach where the similarity graph is first embedded into continuous space using a neural autoencoder. Features derived from this encoding are then used at the input level to a standard DNN-based speech recognizer. We demonstrate improved scalability and performance compared to the standard GBL approach as well as significant improvements in word error rate on a medium-vocabulary Switchboard task.
  • Keywords
    "Acoustics","Hidden Markov models","Training","Standards","Data models","Error analysis","Encoding"
  • Publisher
    IEEE
  • Conference_Title
    Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on
  • Type
    conf
  • DOI
    10.1109/ASRU.2015.7404848
  • Filename
    7404848
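
The pipeline outlined in the abstract (build a similarity graph over samples, embed it with a neural autoencoder, then feed the embedding to the recognizer at the input level) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Gaussian k-NN graph, the single-hidden-layer tanh autoencoder, every function name, and every hyperparameter are assumptions chosen for brevity, and random vectors stand in for acoustic frames.

```python
import numpy as np


def knn_similarity_graph(X, k=3, sigma=1.0):
    """Symmetric k-NN similarity matrix with Gaussian edge weights (assumed graph type)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    S = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(S, 0.0)  # no self-loops
    # Keep each node's k strongest edges, then symmetrize the edge mask.
    idx = np.argsort(-S, axis=1)[:, :k]
    mask = np.zeros(S.shape, dtype=bool)
    mask[np.arange(len(X))[:, None], idx] = True
    mask = mask | mask.T
    return S * mask


def train_autoencoder(S, dim=2, epochs=200, lr=0.5, seed=0):
    """Fit a tiny autoencoder that reconstructs each row of S (hypothetical architecture).

    Returns an encoder function and the per-epoch mean squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n = S.shape[1]
    W1 = rng.normal(0.0, 0.1, (n, dim))
    b1 = np.zeros(dim)
    W2 = rng.normal(0.0, 0.1, (dim, n))
    b2 = np.zeros(n)
    losses = []
    for _ in range(epochs):
        H = np.tanh(S @ W1 + b1)          # bottleneck code
        R = H @ W2 + b2                   # linear reconstruction of the graph rows
        err = R - S
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared error with plain gradient descent.
        gR = 2.0 * err / err.size
        gW2 = H.T @ gR
        gb2 = gR.sum(axis=0)
        gZ = (gR @ W2.T) * (1.0 - H ** 2)
        gW1 = S.T @ gZ
        gb1 = gZ.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2

    def encode(A):
        return np.tanh(A @ W1 + b1)

    return encode, losses


# Random vectors stand in for acoustic feature frames (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))

S = knn_similarity_graph(X, k=3)
encode, losses = train_autoencoder(S, dim=2)
embedding = encode(S)

# Input-level use: append the graph embedding to each frame's original features.
augmented = np.hstack([X, embedding])
```

In this sketch each sample is represented by its row of the similarity matrix, so the bottleneck activations act as a compact encoding of its graph neighborhood; `augmented` is what a downstream DNN would receive in place of the raw features.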