Regional Information Center for Science and Technology (RICeST) - Speaker adaptive Kullback-Leibler divergence based hidden Markov models

DocumentCode :

1693317

Title :

Speaker adaptive Kullback-Leibler divergence based hidden Markov models

Author :

Imseng, David ; Bourlard, Herve

Author_Institution :

Idiap Res. Inst., Martigny, Switzerland

fYear :

2013

Firstpage :

7913

Lastpage :

7917

Abstract :

Kullback-Leibler divergence based hidden Markov models (KL-HMM) have recently been introduced as an efficient and principled way to directly model sequences of posterior vectors for Automatic Speech Recognition (ASR). Through efficient feature-level adaptation and parsimonious use of parameters, KL-HMM has been successfully applied to accented and under-resourced speech recognition tasks. In this paper, inspired by Maximum A Posteriori (MAP) adaptation, we further boost KL-HMM performance with Bayesian speaker adaptation applied directly to posterior features. This approach performs a simple, adaptive regression between phone posteriors estimated with a Multilayer Perceptron (MLP) on large amounts of speaker-independent training data and speaker-specific phone posteriors generated by the speaker-independent MLP on a very limited amount of speaker-specific adaptation data. Using Swiss French data (MediaParl), we show that such a speaker adaptive KL-HMM can significantly outperform conventional adaptation techniques on non-native speech while yielding similar performance on native data.
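
Illustrative sketch (not part of the catalog record or the paper): the abstract names two ingredients, a KL-divergence score between a posterior vector and a state distribution, and a MAP-inspired regression between speaker-independent and speaker-specific posteriors. The Python below shows one plausible reading of each. The function names, the relevance factor `tau`, the use of an arithmetic mean over adaptation frames, and the direction of the divergence are all assumptions made for illustration; the abstract does not specify the authors' formulation.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    Assumption: the abstract does not say which direction of the
    divergence is used; this is just one of the possible choices.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def map_style_interpolate(si_posterior, adaptation_frames, tau=10.0):
    """Blend a speaker-independent posterior vector with the mean of
    speaker-specific posterior vectors.

    Hypothetical MAP-like rule: with n adaptation frames, the speaker-
    specific mean receives weight n / (n + tau), so it dominates only
    once enough adaptation data is available. `tau` is an assumed
    relevance factor, not a value taken from the paper.
    """
    n = len(adaptation_frames)
    if n == 0:
        # No adaptation data: fall back to the speaker-independent vector.
        return list(si_posterior)

    dim = len(si_posterior)
    speaker_mean = [
        sum(frame[k] for frame in adaptation_frames) / n for k in range(dim)
    ]
    alpha = n / (n + tau)
    mixed = [
        alpha * s + (1.0 - alpha) * g
        for s, g in zip(speaker_mean, si_posterior)
    ]
    total = sum(mixed)
    return [m / total for m in mixed]  # renormalise to a distribution


if __name__ == "__main__":
    si = [0.5, 0.5]
    frames = [[1.0, 0.0]] * 10
    adapted = map_style_interpolate(si, frames, tau=10.0)
    print(adapted)
    print(kl_divergence([1.0, 0.0], adapted))
```

With few adaptation frames the blended vector stays close to the speaker-independent one, which matches the abstract's emphasis on very limited adaptation data.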

Keywords :

hidden Markov models; maximum likelihood estimation; multilayer perceptrons; speaker recognition; ASR; Bayesian speaker adaptation; KL-HMM; MAP adaptation; MLP; MediaParl; Swiss French data; accented speech recognition tasks; adaptive regression; automatic speech recognition; feature level adaptation; hidden Markov models; maximum a posteriori adaptation; multilayer perceptron; posterior vectors; speaker adaptive Kullback-Leibler divergence; speaker-independent training data; speaker-specific phone posteriors; under-resourced speech recognition tasks; Acoustics; Databases; Hidden Markov models; Speech; Speech recognition; Standards; Training; Kullback-Leibler divergence; non-native speech; speaker adaptation; speech recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on

Conference_Location :

Vancouver, BC

ISSN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2013.6639205

Filename :

6639205

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1693317