• DocumentCode
    730670
  • Title

    Differentiable pooling for unsupervised speaker adaptation

  • Author

    Swietojanski, Pawel ; Renals, Steve

  • Author_Institution
    Centre for Speech Technol. Res., Univ. of Edinburgh, Edinburgh, UK
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    4305
  • Lastpage
    4309
  • Abstract
    This paper proposes a differentiable pooling mechanism for model-based neural network speaker adaptation. The proposed technique learns a speaker-dependent combination of activations within pools of hidden units, works well in unsupervised settings, and does not require speaker-adaptive training. We conducted a set of experiments on the TED talks data, as used in the IWSLT evaluations. Our results indicate that the approach reduces word error rates (WERs) on standard IWSLT test sets by about 5-11% relative compared to speaker-independent systems. It was also found to be complementary to the recently proposed learning hidden units contribution (LHUC) approach, with the combination reducing WER by 6-13% relative. Both methods were also found to work well when adapting with small amounts of unsupervised data: 10 seconds of adaptation data is enough to decrease the WER by 5% relative compared to the baseline speaker-independent system.
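    The abstract does not say which pooling function is used, so the following is only an illustrative sketch of the idea of a differentiable, speaker-dependent pooling operator. It assumes an Lp-norm pool whose order p is a learnable per-pool parameter. The function name `lp_pool` and its arguments are hypothetical and are not taken from the record.

```python
def lp_pool(activations, pool_size, orders):
    """Pool consecutive groups of `pool_size` activations with an Lp norm.

    `orders` holds one exponent p per pool. In this sketch those exponents
    stand in for the speaker-dependent parameters (an assumption): because
    the output is differentiable in p, p could be tuned by gradient descent
    on a speaker's adaptation data.
    """
    if len(activations) != pool_size * len(orders):
        raise ValueError("activations must split evenly into len(orders) pools")
    pooled = []
    for k, p in enumerate(orders):
        pool = activations[k * pool_size:(k + 1) * pool_size]
        # Normalised Lp norm: ((1/K) * sum_i |a_i|^p)^(1/p)
        mean_power = sum(abs(a) ** p for a in pool) / pool_size
        pooled.append(mean_power ** (1.0 / p))
    return pooled
```

    With p = 1 the pool returns the mean absolute activation, and as p grows it approaches max pooling, so a single learned exponent moves each pool smoothly between the two behaviours.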
  • Keywords
    loudspeakers; IWSLT evaluations; LHUC; TED; WER; differentiable pooling mechanism; learning hidden units contribution; model-based neural network speaker adaptation; speaker-independent systems; unsupervised speaker adaptation; word error rates; Adaptation models; Artificial neural networks; Lead; Training; Deep Neural Networks; Differentiable pooling; Speaker Adaptation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type

    conf

  • DOI
    10.1109/ICASSP.2015.7178783
  • Filename
    7178783