• DocumentCode
    3166017
  • Title
    Maximum a posteriori adaptation of subspace Gaussian mixture models for cross-lingual speech recognition
  • Author
    Lu, Liang ; Ghoshal, Arnab ; Renals, Steve
  • Author_Institution
    Centre for Speech Technology Research, University of Edinburgh, Edinburgh, UK
  • fYear
    2012
  • fDate
    25-30 March 2012
  • Firstpage
    4877
  • Lastpage
    4880
  • Abstract
    This paper concerns cross-lingual acoustic modeling when target language resources are limited. We build on an approach in which a subspace Gaussian mixture model (SGMM) is adapted to the target language by reusing the globally shared parameters estimated from out-of-language training data. In current cross-lingual systems, these parameters are fixed when training the target system, which can give rise to a mismatch between the source and target systems. We investigate a maximum a posteriori (MAP) adaptation approach to alleviate the potential mismatch. In particular, we focus on the adaptation of phonetic subspace parameters using a matrix variate Gaussian prior distribution. Experiments on the GlobalPhone corpus show that the MAP adaptation approach yields word error rate reductions, compared with the cross-lingual baseline systems and systems updated using maximum likelihood, for training conditions with 1 hour and 5 hours of target language data.
  • Keywords
    Gaussian processes; hidden Markov models; maximum likelihood estimation; speech recognition; GlobalPhone corpus; HMM-GMM; MAP adaptation approach; SGMM; cross-lingual acoustic modeling; cross-lingual baseline systems; cross-lingual speech recognition; global shared parameter estimation; maximum a posteriori adaptation; out-of-language training data; subspace Gaussian mixture models; target language resources; word error rate reductions; Acoustics; Adaptation models; Covariance matrix; Hidden Markov models; Mathematical model; Speech recognition; Training data; Cross-lingual Speech Recognition; Maximum a Posteriori Adaptation; Subspace Gaussian Mixture Model;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4673-0045-2
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2012.6289012
  • Filename
    6289012