DocumentCode :
730670
Title :
Differentiable pooling for unsupervised speaker adaptation
Author :
Swietojanski, Pawel ; Renals, Steve
Author_Institution :
Centre for Speech Technology Research, University of Edinburgh, Edinburgh, UK
fYear :
2015
fDate :
19-24 April 2015
Firstpage :
4305
Lastpage :
4309
Abstract :
This paper proposes a differentiable pooling mechanism to perform model-based neural network speaker adaptation. The proposed technique learns a speaker-dependent combination of activations within pools of hidden units, works well in unsupervised settings, and does not require speaker-adaptive training. We conducted a set of experiments on the TED talks data, as used in the IWSLT evaluations. Our results indicate that the approach reduces word error rates (WERs) on standard IWSLT test sets by about 5-11% relative compared to speaker-independent systems. It was also found to be complementary to the recently proposed learning hidden units contribution (LHUC) approach, reducing WER by 6-13% relative. Both methods also work well when adapting with small amounts of unsupervised data: as little as 10 seconds of data decreases the WER by 5% relative compared to the baseline speaker-independent system.
Keywords :
loudspeakers; IWSLT evaluations; LHUC; TED; WER; differentiable pooling mechanism; learning hidden units contribution; model-based neural network speaker adaptation; speaker-independent systems; unsupervised speaker adaptation; word error rates; Adaptation models; Artificial neural networks; Lead; Training; Deep Neural Networks; Differentiable pooling; LHUC; Speaker Adaptation; TED
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
Type :
conf
DOI :
10.1109/ICASSP.2015.7178783
Filename :
7178783