Regional Information Center for Science and Technology - Rapid discriminative acoustic model based on eigenspace mapping for fast speaker adaptation

DocumentCode :

940007

Title :

Rapid discriminative acoustic model based on eigenspace mapping for fast speaker adaptation

Author :

Zhou, Bowen ; Hansen, John H L

Author_Institution :

Robust Speech Process. Group, Univ. of Colorado, Boulder, CO, USA

Volume :

Issue :

fYear :

2005

fDate :

7/1/2005

Firstpage :

554

Lastpage :

564

Abstract :

It is widely believed that strong correlations exist across an utterance as a consequence of time-invariant characteristics of speaker and acoustic environments. It is verified in this paper that the first primary eigendirections of the utterance covariance matrix are speaker dependent. Based on this observation, a novel family of fast speaker adaptation algorithms entitled Eigenspace Mapping (EigMap) is proposed. The proposed algorithms are applied to continuous density Hidden Markov Model (HMM) based speech recognition. The EigMap algorithm rapidly constructs discriminative acoustic models in the test speaker's eigenspace by preserving discriminative information learned from baseline models in the directions of the test speaker's eigenspace. Moreover, the adapted models are compressed by discarding model parameters that are assumed to contain no discrimination information. The core idea of EigMap can be extended in many ways, and a family of algorithms based on EigMap is described in this paper. Unsupervised adaptation experiments show that EigMap is effective in improving baseline models using very limited amounts of adaptation data with superior performance to conventional adaptation techniques such as MLLR and block diagonal MLLR. A relative improvement of 18.4% over a baseline recognizer is achieved using EigMap with only about 4.5 s of adaptation data. Furthermore, it is also demonstrated that EigMap is additive to MLLR by encompassing important speaker dependent discriminative information. A significant relative improvement of 24.6% over baseline is observed using 4.5 s of adaptation data by combining MLLR and EigMap techniques.
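The abstract's central observation concerns the leading eigendirections of the utterance covariance matrix. The sketch below is not the EigMap algorithm itself, which this record does not specify; it only illustrates, under assumptions, how a leading eigendirection of a covariance matrix could be estimated from a sequence of feature frames. The function names `covariance_matrix` and `leading_eigendirection`, the use of power iteration, and the toy two-dimensional frames are all hypothetical choices made for illustration.

```python
import math
import random


def covariance_matrix(frames):
    """Sample covariance of a list of equal-length feature vectors (frames)."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for f in frames:
        centered = [f[d] - mean[d] for d in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += centered[i] * centered[j] / (n - 1)
    return cov


def leading_eigendirection(cov, iterations=200, seed=0):
    """Power iteration: approximate the largest eigenvalue and its unit eigenvector."""
    rng = random.Random(seed)
    dim = len(cov)
    v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalise.
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # v lies in the null space; no dominant direction found
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue estimate for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)
    )
    return eigenvalue, v


# Toy "utterance": three 2-dimensional frames lying along the diagonal.
toy_frames = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
toy_cov = covariance_matrix(toy_frames)
toy_eigenvalue, toy_direction = leading_eigendirection(toy_cov)
```

Power iteration is used here only because it needs no external libraries; a real system would more likely call an eigendecomposition routine from a numerical library and keep several leading directions rather than one.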

Keywords :

covariance matrices; eigenvalues and eigenfunctions; hidden Markov models; speech recognition; eigenspace mapping; fast speaker adaptation; hidden Markov model; rapid discriminative acoustic model; speaker dependent discriminative information; utterance covariance matrix; Acoustic testing; Additives; Covariance matrix; Linear regression; Loudspeakers; Maximum likelihood linear regression; Robustness; Speech processing; Discriminative acoustic model; rapid speaker adaptation

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/TSA.2005.845808

Filename :

1453598

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=940007