Title :
A Non-Linear Speaker Adaptation Technique using Kernel Ridge Regression
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY
Abstract :
We propose a non-linear model space transformation for speaker or environment adaptation based on weighted kernel ridge regression (KRR). The transformation is given by a generalized least squares linear regression in a kernel-induced feature space, operating on Gaussian mixture model means and having the adaptation frames as targets. Using the "kernel trick", the solution to the optimization problem is obtained by solving a system of linear equations involving the Gram matrix of the input variables. We show that maximum likelihood linear regression (MLLR) is a special case of KRR when a linear kernel is employed. Furthermore, we study an efficient low-rank approximation to the kernel matrix, termed the "rectangle method", in which the regressors are chosen to be a small set of clustered adaptation frames. Experiments conducted on the EARS database (English conversational telephone speech) indicate that KRR with a Gaussian RBF kernel outperforms standard regression class-based MLLR.
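The central step described in the abstract, solving a linear system in the Gram matrix, can be illustrated with a minimal sketch. It is not the paper's implementation: the data are synthetic, and the names `rbf_kernel`, `linear_kernel`, `fit_weighted_krr`, the per-sample weights `w`, the regularizer `lam`, and the kernel width `gamma` are all assumptions made for illustration. The rows of `X` stand in for Gaussian means and the rows of `Y` for their regression targets. The sketch uses the standard weighted KRR dual solution, (K + lam * W^-1) alpha = Y, and checks numerically that with a linear kernel the dual solution collapses to a single linear transform of the inputs, which is the sense in which a linear-kernel regression has the same form as a linear mean transform.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian RBF Gram matrix: k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def linear_kernel(A, B):
    """Linear Gram matrix: k(a, b) = a . b."""
    return A @ B.T

def fit_weighted_krr(K, Y, w, lam):
    """Dual weights of weighted KRR: solve (K + lam * W^-1) alpha = Y."""
    return np.linalg.solve(K + lam * np.diag(1.0 / w), Y)

# Synthetic stand-ins (assumptions, not data from the paper).
rng = np.random.default_rng(0)
n, d = 50, 4
X = rng.normal(size=(n, d))                         # inputs (e.g. means)
Y = np.tanh(X) + 0.1 * rng.normal(size=(n, d))      # targets
w = rng.uniform(0.5, 2.0, size=n)                   # positive weights
lam, gamma = 0.1, 0.5

# Non-linear case: Gaussian RBF kernel.
K_rbf = rbf_kernel(X, X, gamma)
alpha_rbf = fit_weighted_krr(K_rbf, Y, w, lam)
adapted_rbf = K_rbf @ alpha_rbf                     # transformed inputs

# Linear case: the dual solution is equivalent to one d x d matrix.
alpha_lin = fit_weighted_krr(linear_kernel(X, X), Y, w, lam)
A_dual = X.T @ alpha_lin

# Primal weighted ridge regression, solved directly for comparison.
W = np.diag(w)
A_primal = np.linalg.solve(X.T @ W @ X + lam * np.eye(d), X.T @ W @ Y)

print("linear kernel matches primal ridge:", np.allclose(A_dual, A_primal))
```

The dual system is n x n, so its cost grows with the number of regressors; this is the motivation the abstract gives for a low-rank approximation to the kernel matrix, which the sketch does not attempt to reproduce.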
Keywords :
Gaussian processes; least squares approximations; matrix algebra; regression analysis; speaker recognition; English conversational telephone speech; Gaussian mixture model; Gram matrix; generalized least squares linear regression; kernel ridge regression; kernel-induced feature space; linear equations; low-rank approximation; nonlinear speaker adaptation technique; Databases; Ear; Equations; Input variables; Kernel; Least squares approximation; Least squares methods; Linear regression; Maximum likelihood linear regression; Telephony
Conference_Title :
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location :
Toulouse
Print_ISBN :
1-4244-0469-X
DOI :
10.1109/ICASSP.2006.1659998