Title :
Speaker adaptation using constrained transformation
Author :
Wu, Xintian ; Yan, Yonghong
Author_Institution :
Intel Corp., Santa Clara, CA, USA
Date :
3/1/2004
Abstract :
In speech recognition research, transformation-based adaptation algorithms provide an effective way of adapting acoustic models to improve recognition accuracy. However, when only a limited amount of adaptation data is available, the transformation is often poorly estimated, which may degrade performance. This paper presents the Markov Random Field Linear Regression (MRFLR) algorithm, which constrains transformation-based adaptation using the correlations among acoustic parameters. Markov Random Field theory is used to model these correlations. The correlations are estimated from the training corpus and hypothesized as prior knowledge of acoustic models. By explicitly incorporating them into adaptation, robust and fast adaptation can be achieved. The hypothesis is tested by comparing MRFLR with MLLR (Maximum Likelihood Linear Regression), a widely used transformation-based adaptation algorithm. Experimental results show that MRFLR outperforms MLLR when adaptation data are sparse, and converges to the MLLR performance when more adaptation data are available.
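The effect described in the abstract, where an unconstrained transform is poorly estimated from sparse data while a constrained one stays stable, can be illustrated with a minimal one-dimensional sketch. This is not the MRFLR algorithm: the quadratic penalty pulling the transform toward the identity is only a stand-in for the paper's correlation-based prior, and the names `fit_affine` and `lam` are hypothetical.

```python
def fit_affine(xs, ys, lam=0.0):
    """Fit y ~ a*x + b by least squares, shrinking (a, b) toward (1, 0).

    xs  : unadapted model parameters (e.g. scalar Gaussian means)
    ys  : values observed for those parameters in the adaptation data
    lam : strength of the constraint; 0.0 gives the unconstrained fit
          (assumption: a simple quadratic penalty, not the paper's MRF prior)
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # Normal equations of: sum (y - a*x - b)^2 + lam * ((a - 1)^2 + b^2)
    m11, m12, m22 = sxx + lam, sx, n + lam
    r1, r2 = sxy + lam, sy

    det = m11 * m22 - m12 * m12
    if abs(det) < 1e-12:
        # With too few points and no constraint the transform is not identifiable.
        raise ValueError("transform cannot be estimated without a constraint")

    # Solve the 2x2 system by Cramer's rule.
    a = (r1 * m22 - m12 * r2) / det
    b = (m11 * r2 - m12 * r1) / det
    return a, b
```

With a single adaptation point, `fit_affine([1.0], [3.0])` raises an error because the unconstrained problem is singular, whereas `fit_affine([1.0], [3.0], lam=1.0)` returns a well-defined transform. As more points are supplied, the data terms dominate the penalty and the constrained estimate approaches the unconstrained one, which mirrors the convergence behavior the abstract reports.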
Keywords :
Markov processes; regression analysis; speaker recognition; Markov random field linear regression algorithm; acoustic models; acoustic parameter correlation; constrained transformation; maximum likelihood linear regression; recognition accuracy; speaker adaptation; Acoustic testing; Adaptation model; Data mining; Degradation; Linear regression; Loudspeakers; Markov random fields; Maximum likelihood linear regression; Robustness; Speech recognition
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
DOI :
10.1109/TSA.2003.818029