Title :
Speaker adaptive Kullback-Leibler divergence based hidden Markov models
Author :
Imseng, David ; Bourlard, Herve
Author_Institution :
Idiap Res. Inst., Martigny, Switzerland
Abstract :
Kullback-Leibler divergence based hidden Markov models (KL-HMM) have recently been introduced as an efficient and principled way to directly model sequences of posterior vectors to perform Automatic Speech Recognition (ASR). Through efficient feature level adaptation and parsimonious use of parameters, KL-HMM was successfully applied to accented and under-resourced speech recognition tasks. In this paper, inspired by Maximum A Posteriori (MAP) adaptation, we further boost KL-HMM performance by applying Bayesian speaker adaptation directly to posterior features. This approach performs a simple, adaptive regression between phone posteriors estimated with a Multilayer Perceptron (MLP) on large amounts of speaker-independent training data, and speaker-specific phone posteriors generated by the speaker-independent MLP on a very limited amount of speaker-specific adaptation data. Using Swiss French data (MediaParl), we show that such a speaker adaptive KL-HMM can significantly outperform conventional adaptation techniques on non-native speech while yielding similar performance on native data.
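The abstract gives no formulas, so the following is only an illustrative sketch of the two ingredients it names: a KL divergence between posterior vectors, and a MAP-style interpolation between speaker-independent and speaker-specific posteriors. The weight alpha = n / (n + tau), the function names, and the parameter `tau` are assumptions made for illustration and are not taken from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-10):
    """KL(p || q) between two discrete distributions given as lists.

    eps guards against log(0); terms with p_i == 0 contribute nothing.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def map_style_interpolation(si_posterior, spk_posterior, n_frames, tau=10.0):
    """Blend speaker-independent and speaker-specific phone posteriors.

    Assumption: the usual MAP-style weight alpha = n / (n + tau), so that
    little adaptation data keeps the result close to the
    speaker-independent estimate. tau is a hypothetical relevance factor.
    """
    alpha = n_frames / (n_frames + tau)
    mixed = [
        alpha * s + (1.0 - alpha) * g
        for s, g in zip(spk_posterior, si_posterior)
    ]
    total = sum(mixed)
    return [m / total for m in mixed]  # renormalise to sum to one


# Toy three-phone example with made-up numbers.
si = [0.7, 0.2, 0.1]    # speaker-independent posterior
spk = [0.4, 0.5, 0.1]   # speaker-specific posterior
adapted = map_style_interpolation(si, spk, n_frames=10, tau=10.0)
score = kl_divergence(adapted, si)
```

With no adaptation frames the blend falls back to the speaker-independent vector, and as `n_frames` grows it moves toward the speaker-specific one, which is the behaviour a MAP-inspired scheme is meant to have.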
Keywords :
hidden Markov models; maximum likelihood estimation; multilayer perceptrons; speaker recognition; ASR; Bayesian speaker adaptation; KL-HMM; MAP adaptation; MLP; MediaParl; Swiss French data; accented speech recognition tasks; adaptive regression; automatic speech recognition; feature level adaptation; maximum a posteriori adaptation; multilayer perceptron; posterior vectors; speaker adaptive Kullback-Leibler divergence; speaker-independent training data; speaker-specific phone posteriors; under-resourced speech recognition tasks; Acoustics; Databases; Speech; Speech recognition; Standards; Training; Kullback-Leibler divergence; non-native speech; speaker adaptation
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639205