Title :
Fast and flexible Kullback-Leibler divergence based acoustic modeling for non-native speech recognition
Author :
Imseng, David ; Rasipuram, Ramya ; Magimai-Doss, Mathew
Abstract :
One of the main challenges in non-native speech recognition is handling the acoustic variability present in multi-accented non-native speech with a limited amount of training data. In this paper, we investigate an approach that addresses this challenge by using Kullback-Leibler divergence based hidden Markov models (KL-HMM). More precisely, the acoustic variability in multi-accented speech is handled by using multilingual phoneme posterior probabilities, estimated by a multilayer perceptron trained on auxiliary data, as input features for the KL-HMM system. With limited training data, we then build better acoustic models by exploiting the fact that the KL-HMM system has fewer parameters. On the HIWIRE corpus, the proposed approach yields a word error rate (WER) of 1.9% with 149 minutes of training data and a WER of 5.5% with 2 minutes of training data.
Keywords :
acoustic signal processing; error statistics; hidden Markov models; multilayer perceptrons; natural language processing; probability; speech recognition; HIWIRE corpus; KL-HMM system; Kullback-Leibler divergence based acoustic modeling; Kullback-Leibler divergence based hidden Markov model; acoustic variability; auxiliary data; multiaccented nonnative speech; multilayer perceptron; multilingual phoneme posterior probabilities; nonnative speech recognition; word error rate; Acoustics; Adaptation models; Feature extraction; Hidden Markov models; Speech; Speech recognition; Training; Kullback-Leibler divergence; Non-native speech recognition; hidden Markov model; multilayer perceptron; posterior features;
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Conference_Location :
Waikoloa, HI
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
DOI :
10.1109/ASRU.2011.6163956