Title :
Maximum Penalized Likelihood Kernel Regression for Fast Adaptation
Author :
Mak, Brian Kan-Wing ; Lai, Tsz-Chung ; Tsang, Ivor W. ; Kwok, James Tin-Yau
Author_Institution :
Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
Abstract :
This paper proposes a nonlinear generalization of the popular maximum-likelihood linear regression (MLLR) adaptation algorithm using kernel methods. The proposed method, called maximum penalized likelihood kernel regression adaptation (MPLKR), applies kernel regression with appropriate regularization to determine the affine model transform in a kernel-induced high-dimensional feature space. Although this is not the first attempt to apply kernel methods to conventional linear adaptation algorithms, MPLKR, unlike most other kernelized adaptation methods such as kernel eigenvoice or kernel eigen-MLLR, has the advantage that it is a convex optimization problem whose solution is guaranteed to be globally optimal. In fact, the adapted Gaussian means can be obtained analytically by solving a system of linear equations. From the Bayesian perspective, MPLKR can also be regarded as the kernel version of maximum a posteriori linear regression (MAPLR) adaptation. Supervised and unsupervised speaker adaptation using MPLKR were evaluated on the Resource Management and Wall Street Journal 5K tasks, respectively, achieving word error rate reductions of 23.6% and 15.5% over the speaker-independent model.
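The statement that a regularized kernel regression has a globally optimal solution obtainable from a system of linear equations can be illustrated with generic kernel ridge regression. The sketch below is not the MPLKR algorithm itself: the RBF kernel, the squared-error objective, the penalty weight `lam`, and all function names are assumptions chosen for illustration, and the abstract does not specify the paper's actual objective or kernel.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_ridge(X, Y, lam=1e-2, gamma=1.0):
    """Solve (K + lam * I) alpha = Y for the dual coefficients alpha.

    The penalized objective is convex, so this single linear solve
    yields its global minimizer (illustrative objective, not the paper's).
    """
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(X.shape[0]), Y)

def predict_kernel_ridge(X_train, alpha, X_new, gamma=1.0):
    """Evaluate the fitted regression function at the rows of X_new."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha

if __name__ == "__main__":
    # Hypothetical toy data: four 1-D inputs with 2-D targets.
    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    Y = np.array([[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [2.0, -1.0]])
    alpha = fit_kernel_ridge(X, Y, lam=1e-2)
    print(predict_kernel_ridge(X, alpha, np.array([[1.5]])))
```

Increasing `lam` strengthens the penalty and pulls the fit toward smoother functions, which is the general role regularization plays when little adaptation data is available.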
Keywords :
Gaussian processes; maximum likelihood estimation; optimisation; regression analysis; speaker recognition; Resource Management and Wall Street Journal; adapted Gaussian means; affine model transform; convex optimization; kernel eigen-MLLR; kernel eigenvoice; kernel-induced high-dimensional feature space; linear equations; maximum a posteriori linear regression adaptation; maximum penalized likelihood kernel regression adaptation; maximum-likelihood linear regression adaptation algorithm; supervised speaker adaptation; unsupervised speaker adaptation; word error rate reduction; Adaptation model; Bayesian methods; Equations; Kernel; Linear regression; Loudspeakers; Maximum likelihood linear regression; Optimization methods; Regression tree analysis; Speech; Kernel regression; maximum-likelihood linear regression (MLLR); reference speaker weighting; speaker adaptation;
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2009.2019920