Regional Information Center for Science and Technology (RICeST) - Learning factorized feature transforms for speaker normalization

DocumentCode :

3744837

Title :

Learning factorized feature transforms for speaker normalization

Author :

Lahiru Samarakoon; Khe Chai Sim

Author_Institution :

School of Computing, National University of Singapore, Singapore

Year :

2015

Firstpage :

145

Lastpage :

152

Abstract :

This paper proposes an approach to improve automatic speech recognition (ASR) by using i-vectors to normalize the speaker variability of a well-trained deep neural network (DNN) acoustic model. The approach learns a speaker-dependent transformation of the acoustic features, combined with the standard speaker-dependent bias, to minimize the mismatch caused by inter-speaker variability. Speaker normalization experiments on the Aurora 4 task show a 10.9% relative improvement over the baseline. The proposed approach also achieves a 4.5% relative improvement over the standard i-vector-based method, in which only a speaker-dependent bias is used. Finally, we report an analysis comparing our approach with the Constrained Maximum Likelihood Linear Regression (CMLLR) method.
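
To make the idea in the abstract concrete, the following is a minimal sketch of a speaker-dependent feature transform combined with a speaker-dependent bias, both driven by an i-vector. The specific form used here (an identity matrix plus an i-vector-weighted sum of basis matrices, and a bias that is linear in the i-vector) is an assumption chosen for illustration; the record does not state the paper's actual parameterization. The names `speaker_transform`, `bases`, and `w_bias` are hypothetical.

```python
def speaker_transform(x, ivec, bases, w_bias):
    """Illustrative speaker normalization of one feature vector.

    Assumed form (not taken from the paper):
        A = I + sum_k ivec[k] * bases[k]      # speaker-dependent transform
        b = w_bias @ ivec                     # speaker-dependent bias
        x_norm = A @ x + b

    x      : list of D floats (acoustic feature vector)
    ivec   : list of K floats (speaker i-vector)
    bases  : list of K matrices, each D x D (hypothetical learned bases)
    w_bias : D x K matrix (hypothetical learned bias projection)
    """
    d = len(x)
    k_dim = len(ivec)

    # Build the speaker-dependent matrix A, starting from the identity.
    a = [
        [
            (1.0 if i == j else 0.0)
            + sum(ivec[k] * bases[k][i][j] for k in range(k_dim))
            for j in range(d)
        ]
        for i in range(d)
    ]

    # Speaker-dependent bias: a linear projection of the i-vector.
    b = [sum(w_bias[i][k] * ivec[k] for k in range(k_dim)) for i in range(d)]

    # Apply the affine transform to the feature vector.
    return [sum(a[i][j] * x[j] for j in range(d)) + b[i] for i in range(d)]
```

Under this assumed form, a zero i-vector leaves the features unchanged, and dropping the basis matrices reduces the transform to a bias-only method of the kind the abstract describes as the standard i-vector-based approach.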

Keywords :

"Adaptation models","Training","Hidden Markov models","Estimation","Mathematical model","Acoustics","Transforms"

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on

Type :

conf

DOI :

10.1109/ASRU.2015.7404787

Filename :

7404787

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3744837