DocumentCode :
3744857
Title :
Towards utterance-based neural network adaptation in acoustic modeling
Author :
Ivan Himawan;Petr Motlicek;Marc Ferras Font;Srikanth Madikeri
Author_Institution :
Idiap Research Institute, Martigny, Switzerland
fYear :
2015
Firstpage :
289
Lastpage :
295
Abstract :
Despite the superior classification ability of deep neural networks (DNNs), their performance suffers when there is a mismatch between training and testing conditions. Many speaker adaptation techniques have been proposed for DNN acoustic modeling, but progress on environmental robustness is still limited. Techniques developed for speaker adaptation can also be used to handle the impact of the environment at the same time, or the two approaches can be combined. Directly adapting the large number of DNN parameters is challenging when the adaptation set is small. The learning hidden unit contributions (LHUC) technique for unsupervised speaker adaptation of DNNs adds speaker-dependent parameters to an existing speaker-independent network to improve automatic speech recognition (ASR) performance for the target speaker using small amounts of adaptation data. This paper investigates LHUC for adapting the speech recognizer to target speakers and environments, quantifying the impacts of speaker and noise differences separately. Our findings show that LHUC is capable of adapting to both speaker and noise conditions at the same time. Compared to the speaker-independent model, relative word error rate (WER) improvements of about 9% to 13% are observed for all test conditions on the AMI meeting corpus.
Keywords :
"Adaptation models","Hidden Markov models","Acoustics","Training","Speech","Data models","Signal to noise ratio"
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on
Type :
conf
DOI :
10.1109/ASRU.2015.7404807
Filename :
7404807