DocumentCode :
1118214
Title :
Kernel Eigenspace-Based MLLR Adaptation
Author :
Mak, Brian Kan-Wing ; Hsiao, Roger Wend-Huu
Author_Institution :
Dept. of Computer Sci., Hong Kong Univ. of Sci. & Technol.
Volume :
15
Issue :
3
fYear :
2007
fDate :
3/1/2007 12:00:00 AM
Firstpage :
784
Lastpage :
795
Abstract :
In this paper, we propose an application of kernel methods for fast speaker adaptation based on kernelizing the eigenspace-based maximum-likelihood linear regression adaptation method. We call our new method "kernel eigenspace-based maximum-likelihood linear regression adaptation" (KEMLLR). In KEMLLR, speaker-dependent (SD) models are estimated from a common speaker-independent (SI) model using MLLR adaptation, and the MLLR transformation matrices are mapped to a kernel-induced high-dimensional feature space, wherein kernel principal component analysis is used to derive a set of eigenmatrices. In addition, a composite kernel is used to preserve row information in the transformation matrices. A new speaker's MLLR transformation matrix is then represented as a linear combination of the leading kernel eigenmatrices, which, though it exists only in the feature space, still allows the speaker's mean vectors to be found explicitly. As a result, at the end of KEMLLR adaptation, a regular hidden Markov model (HMM) is obtained for the new speaker, and subsequent speech recognition is as fast as normal HMM decoding. KEMLLR adaptation was tested and compared with other adaptation methods on the Resource Management and Wall Street Journal tasks using 5 or 10 s of adaptation speech. In both cases, KEMLLR adaptation gives the greatest improvement over the SI model, with an 11%-20% word error rate reduction.
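As an illustration of the kernel PCA step the abstract describes (deriving leading eigencomponents in a kernel-induced feature space), here is a minimal, hypothetical sketch using a standard RBF kernel and the usual kernel-matrix centering. It is not the paper's implementation: the toy rows below merely stand in for vectorized MLLR transformation matrices, and the paper additionally uses a composite kernel that preserves matrix row information, which this sketch omits.

```python
import numpy as np

def rbf_kernel(X, gamma=0.5):
    # Pairwise squared Euclidean distances, then the RBF kernel matrix.
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * d2)

def kernel_pca(K, n_components=2):
    # Standard kernel PCA: center K in feature space, eigendecompose,
    # and project the training points onto the leading eigen-directions.
    n = K.shape[0]
    one_n = np.ones((n, n)) / n
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    vals, vecs = np.linalg.eigh(Kc)          # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    # Scale coefficients so feature-space eigenvectors have unit norm.
    alphas = vecs / np.sqrt(np.maximum(vals, 1e-12))
    return Kc @ alphas                        # (n, n_components) projections

# Toy data: each row stands in for one speaker's vectorized MLLR matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 6))
Z = kernel_pca(rbf_kernel(X), n_components=2)
print(Z.shape)  # (10, 2)
```

In KEMLLR the analogous coefficients weight the leading kernel eigenmatrices for a new speaker; the paper's contribution is recovering the speaker's HMM mean vectors explicitly even though the combined eigenmatrix lives only in the feature space.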
Keywords :
eigenvalues and eigenfunctions; error statistics; hidden Markov models; matrix algebra; maximum likelihood estimation; principal component analysis; regression analysis; speaker recognition; HMM decoding; MLLR transformation matrices; Wall Street Journal task; fast speaker adaptation; hidden Markov model; kernel eigenmatrices; kernel eigenspace-based MLLR adaptation; kernel principal component analysis; kernel-induced high-dimensional feature space; maximum-likelihood linear regression adaptation method; resource management task; speaker mean vectors; speaker-dependent model; speech recognition; word error rate reduction; Adaptation model; Hidden Markov models; Kernel; Linear regression; Maximum likelihood decoding; Maximum likelihood estimation; Maximum likelihood linear regression; Principal component analysis; Speech recognition; Vectors; Broyden–Fletcher–Goldfarb–Shanno (BFGS) optimization; composite kernels; eigenspace-based maximum-likelihood linear regression (MLLR) adaptation; eigenvoice speaker adaptation; embedded kernel eigenvoice adaptation; kernel eigenvoice adaptation; kernel principal component analysis (PCA);
fLanguage :
English
Journal_Title :
IEEE Transactions on Audio, Speech, and Language Processing
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2006.885941
Filename :
4100690