DocumentCode
3166017
Title
Maximum a posteriori adaptation of subspace Gaussian mixture models for cross-lingual speech recognition
Author
Lu, Liang ; Ghoshal, Arnab ; Renals, Steve
Author_Institution
Centre for Speech Technol. Res., Univ. of Edinburgh, Edinburgh, UK
fYear
2012
fDate
25-30 March 2012
Firstpage
4877
Lastpage
4880
Abstract
This paper concerns cross-lingual acoustic modeling in the case when there are limited target language resources. We build on an approach in which a subspace Gaussian mixture model (SGMM) is adapted to the target language by reusing the globally shared parameters estimated from out-of-language training data. In current cross-lingual systems, these parameters are fixed when training the target system, which can give rise to a mismatch between the source and target systems. We investigate a maximum a posteriori (MAP) adaptation approach to alleviate the potential mismatch. In particular, we focus on the adaptation of phonetic subspace parameters using a matrix variate Gaussian prior distribution. Experiments on the GlobalPhone corpus using the MAP adaptation approach result in word error rate reductions, compared with the cross-lingual baseline systems and systems updated using maximum likelihood, for training conditions with 1 hour and 5 hours of target language data.
Keywords
Gaussian processes; hidden Markov models; maximum likelihood estimation; speech recognition; GlobalPhone corpus; HMM-GMM; MAP adaptation approach; SGMM; cross-lingual acoustic modeling; cross-lingual baseline systems; cross-lingual speech recognition; global shared parameter estimation; maximum a posteriori adaptation; out-of-language training data; subspace Gaussian mixture models; target language resources; word error rate reductions; Acoustics; Adaptation models; Covariance matrix; Hidden Markov models; Mathematical model; Speech recognition; Training data; Cross-lingual Speech Recognition; Maximum a Posteriori Adaptation; Subspace Gaussian Mixture Model;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location
Kyoto
ISSN
1520-6149
Print_ISBN
978-1-4673-0045-2
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2012.6289012
Filename
6289012