Title :
Improving Reference Speaker Weighting Adaptation by the Use of Maximum-Likelihood Reference Speakers
Author :
Mak, Brian ; Lai, Tsz-Chung ; Hsiao, Roger
Author_Institution :
Dept. of Comput. Sci., Hong Kong Univ. of Sci. & Technol., Kowloon
Abstract :
We revisit a simple fast adaptation technique called reference speaker weighting (RSW). RSW is similar to eigenvoice (EV) adaptation and simply requires the model of a new speaker to lie in the span of a set of reference speaker vectors. In the original RSW, the reference speakers are computed through a hierarchical speaker clustering (HSC) algorithm using information such as gender and speaking rate. We show in this paper that RSW adaptation may be improved if the training speakers that give the highest likelihoods of the adaptation data are selected as the reference speakers; we call them the maximum-likelihood (ML) reference speakers. When RSW adaptation was evaluated on WSJ0 using 5 s of adaptation speech, the word error rate reduction was boosted from 2.54% to 9.15% by using 10 ML reference speakers instead of reference speakers determined from HSC. Moreover, when compared with EV, MAP, MLLR, and eKEV on fast adaptation, the algorithmically simplest technique, RSW, surprisingly gives the best performance.
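The two steps described in the abstract (pick the training speakers whose models give the adaptation data the highest likelihood, then place the new speaker in the span of those reference vectors) can be illustrated with a toy sketch. This is not the paper's implementation. It assumes each speaker is a single unit-variance Gaussian mean vector rather than an HMM supervector, that the weights are unconstrained, and that the reference vectors are linearly independent. Under those assumptions the maximum-likelihood weights reduce to a least-squares fit of the sample mean. All function names are hypothetical.

```python
import math


def log_likelihood(frames, mean):
    """Log-likelihood of frames under a unit-variance Gaussian (toy assumption)."""
    const = -0.5 * len(mean) * math.log(2.0 * math.pi)
    return sum(
        const - 0.5 * sum((x - m) ** 2 for x, m in zip(frame, mean))
        for frame in frames
    )


def select_ml_reference_speakers(frames, speaker_means, k):
    """Indices of the k training speakers with the highest data likelihood."""
    ranked = sorted(
        range(len(speaker_means)),
        key=lambda i: log_likelihood(frames, speaker_means[i]),
        reverse=True,
    )
    return ranked[:k]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def rsw_weights(frames, reference_means):
    """ML weights in the toy model: least-squares fit of the sample mean."""
    dim = len(reference_means[0])
    sample_mean = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    # Normal equations: (R R^T) w = R * sample_mean, rows of R are references.
    gram = [
        [sum(u * v for u, v in zip(ri, rj)) for rj in reference_means]
        for ri in reference_means
    ]
    rhs = [sum(u * v for u, v in zip(ri, sample_mean)) for ri in reference_means]
    return solve_linear_system(gram, rhs)


def adapt(frames, speaker_means, k):
    """Select k ML reference speakers, then combine them with fitted weights."""
    idx = select_ml_reference_speakers(frames, speaker_means, k)
    refs = [speaker_means[i] for i in idx]
    weights = rsw_weights(frames, refs)
    adapted = [
        sum(w * r[d] for w, r in zip(weights, refs)) for d in range(len(refs[0]))
    ]
    return idx, weights, adapted
```

A call such as `adapt(frames, speaker_means, 10)` would mirror the 10 ML reference speakers mentioned above. In a real system the likelihoods and weights would be computed from speaker-adapted HMMs with their own covariances, not from this simplified model.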
Keywords :
eigenvalues and eigenfunctions; maximum likelihood estimation; speaker recognition; eigenvoice adaptation; hierarchical speaker clustering; maximum-likelihood reference speakers; reference speaker weighting adaptation; Adaptation model; Bayesian methods; Clustering algorithms; Computer science; Councils; Error analysis; Kernel; Maximum likelihood linear regression; Natural languages; Speech analysis;
Conference_Title :
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location :
Toulouse
Print_ISBN :
1-4244-0469-X
DOI :
10.1109/ICASSP.2006.1659999