Title :
Unsupervised discriminative adaptation using discriminative mapping transforms
Author :
Yu, K. ; Gales, M.J.F. ; Woodland, P.C.
Author_Institution :
Eng. Dept., Cambridge Univ., Cambridge
Date :
March 31 2008-April 4 2008
Abstract :
The most commonly used approaches to speaker adaptation are based on linear transforms, as these can be robustly estimated using limited adaptation data. Although significant gains can be obtained using discriminative criteria for training acoustic models, maximum likelihood (ML) estimated transforms are used for unsupervised adaptation. This is because discriminatively trained transforms are highly sensitive to errors in the adaptation hypothesis. This paper describes a new framework for estimating transforms that are discriminative in nature, but are less sensitive to such hypothesis errors. A discriminative, speaker-independent mapping transform is estimated during training. This transform is obtained after a speaker-specific ML-estimated transform has been applied. During recognition, an ML speaker-specific transform is found and the speaker-independent discriminative mapping transform is then applied. This allows a transform which is discriminative in nature to be indirectly estimated, whilst only requiring an ML speaker-specific transform to be found during recognition. The scheme is evaluated on an English conversational telephone speech task, where it significantly outperforms both standard ML and discriminatively trained transforms.
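The recognition-time ordering described in the abstract (speaker-specific ML transform first, shared discriminative mapping transform second) can be illustrated with a minimal sketch. It assumes, purely for illustration, that both transforms are affine maps applied to a Gaussian mean vector; the abstract does not state which model parameters are transformed. The function names `apply_affine` and `adapt_mean` and all numeric values are hypothetical, and no transform estimation is shown.

```python
def apply_affine(matrix, bias, vector):
    """Return matrix * vector + bias using plain Python lists."""
    return [
        sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) + b_i
        for row, b_i in zip(matrix, bias)
    ]


def adapt_mean(mean, ml_transform, mapping_transform):
    """Apply the speaker-specific ML transform, then the shared mapping transform."""
    ml_matrix, ml_bias = ml_transform
    map_matrix, map_bias = mapping_transform
    ml_adapted = apply_affine(ml_matrix, ml_bias, mean)
    return apply_affine(map_matrix, map_bias, ml_adapted)


# Toy values, chosen only for illustration (assumption: 2-dimensional means).
mean = [1.0, 2.0]
# Assumed to be estimated per test speaker with an ML criterion.
ml_transform = ([[2.0, 0.0], [0.0, 1.0]], [0.0, 1.0])
# Assumed to be estimated once during training and shared across speakers.
mapping_transform = ([[1.0, 0.0], [0.0, 2.0]], [1.0, 0.0])

adapted = adapt_mean(mean, ml_transform, mapping_transform)
print(adapted)
```

In this sketch only `ml_transform` would change from speaker to speaker, which mirrors the abstract's point that recognition requires nothing beyond an ML-estimated speaker-specific transform.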
Keywords :
maximum likelihood estimation; speech processing; speech recognition; transforms; English conversational telephone speech task; acoustic models; discriminative mapping transforms; linear transforms; maximum likelihood estimated transforms; speaker adaptation; speaker-independent mapping transform; speaker-specific ML-estimated transform; unsupervised adaptation; unsupervised discriminative adaptation; Acoustical engineering; Data engineering; Loudspeakers; Maximum likelihood estimation; Maximum likelihood linear regression; OFDM modulation; Robustness; Speech; Telephony; Training data; Speaker adaptation; discriminative training
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2008.4518599