Title :
Learning factorized feature transforms for speaker normalization
Author :
Lahiru Samarakoon;Khe Chai Sim
Author_Institution :
School of Computing, National University of Singapore, Singapore
Abstract :
This paper proposes an approach to improve automatic speech recognition (ASR) by using i-vectors to normalize speaker variability in a well-trained deep neural network (DNN) acoustic model. Our approach learns a speaker-dependent transformation of the acoustic features, combined with the standard speaker-dependent bias, to minimize the mismatch caused by inter-speaker variability. Speaker normalization experiments on the Aurora 4 task show a 10.9% relative improvement over the baseline. Moreover, the proposed approach achieves a 4.5% relative improvement over the standard i-vector-based method, in which only a speaker-dependent bias is used. We also report an analysis comparing our approach with the Constrained Maximum Likelihood Linear Regression (CMLLR) method.
Keywords :
"Adaptation models","Training","Hidden Markov models","Estimation","Mathematical model","Acoustics","Transforms"
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on
DOI :
10.1109/ASRU.2015.7404787