Title :
Acoustic modeling with neural graph embeddings
Author :
Yuzong Liu;Katrin Kirchhoff
Author_Institution :
Department of Electrical Engineering, University of Washington, Seattle, WA 98195
Abstract :
Graph-based learning (GBL) is a form of semi-supervised learning that has been successfully applied to acoustic modeling. It exploits manifold information in speech data, represented as a joint similarity graph over training and test samples. Typically, GBL is applied at the output level of an acoustic classifier; however, this setup is difficult to scale to large data sets, and the graph-based learner is not optimized jointly with the other components of the speech recognition system. In this paper, we explore a different approach in which the similarity graph is first embedded into a continuous space using a neural autoencoder. Features derived from this encoding are then used at the input level of a standard DNN-based speech recognizer. We demonstrate improved scalability and performance compared to the standard GBL approach, as well as significant improvements in word error rate on a medium-vocabulary Switchboard task.
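The pipeline outlined in the abstract (similarity graph, autoencoder embedding, features fed to the recognizer input) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the k-nearest-neighbor graph with RBF weights, the single tanh hidden layer, the squared-error reconstruction loss, the concatenation of the embedding with the acoustic features, and the hypothetical names `knn_similarity_graph`, `train_autoencoder`, and `embed`. Random vectors stand in for acoustic frames.

```python
import numpy as np


def knn_similarity_graph(x, k=5, sigma=1.0):
    """Row-normalized kNN similarity matrix with RBF weights (assumed graph type)."""
    n = x.shape[0]
    sq_dist = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq_dist, np.inf)  # exclude self-loops
    neighbors = np.argsort(sq_dist, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = neighbors.ravel()
    w = np.zeros((n, n))
    w[rows, cols] = np.exp(-sq_dist[rows, cols] / (2.0 * sigma ** 2))
    # Guard against an all-zero row if the RBF weights underflow.
    return w / np.maximum(w.sum(axis=1, keepdims=True), 1e-12)


def train_autoencoder(w, dim=4, lr=0.5, epochs=200, seed=0):
    """Fit a one-hidden-layer autoencoder to the graph rows by gradient descent."""
    rng = np.random.default_rng(seed)
    n = w.shape[1]
    enc = rng.normal(scale=0.1, size=(n, dim))
    dec = rng.normal(scale=0.1, size=(dim, n))
    b_enc = np.zeros(dim)
    b_dec = np.zeros(n)
    losses = []
    for _ in range(epochs):
        h = np.tanh(w @ enc + b_enc)        # encoder: graph row -> embedding
        recon = h @ dec + b_dec             # decoder: embedding -> graph row
        err = recon - w
        losses.append(float((err ** 2).mean()))
        g_recon = 2.0 * err / err.size      # gradient of the mean squared error
        g_pre = (g_recon @ dec.T) * (1.0 - h ** 2)
        dec -= lr * (h.T @ g_recon)
        b_dec -= lr * g_recon.sum(axis=0)
        enc -= lr * (w.T @ g_pre)
        b_enc -= lr * g_pre.sum(axis=0)
    return (enc, b_enc), losses


def embed(w, params):
    """Map each node's similarity row to its continuous embedding."""
    enc, b_enc = params
    return np.tanh(w @ enc + b_enc)


rng = np.random.default_rng(0)
acoustic = rng.normal(size=(40, 8))         # stand-in for training + test frames
graph = knn_similarity_graph(acoustic, k=5)
params, losses = train_autoencoder(graph, dim=4)
# Assumed combination: append the graph embedding to each frame's features.
augmented = np.hstack([acoustic, embed(graph, params)])
```

In this sketch `augmented` would be what a downstream DNN acoustic model receives as input. The dense pairwise-distance matrix is quadratic in the number of frames, so a realistic setup would need a sparse or approximate neighbor search.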
Keywords :
"Acoustics","Hidden Markov models","Training","Standards","Data models","Error analysis","Encoding"
Conference_Title :
2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
DOI :
10.1109/ASRU.2015.7404848