  • DocumentCode
    865508
  • Title
    Discriminative Estimation of Subspace Constrained Gaussian Mixture Models for Speech Recognition
  • Author
    Axelrod, Scott ; Goel, Vaibhava ; Gopinath, Ramesh ; Olsen, Peder ; Visweswariah, Karthik
  • Author_Institution
    Falcon Manage. Corp., Wyckoff, NJ
  • Volume
    15
  • Issue
    1
  • fYear
    2007
  • Firstpage
    172
  • Lastpage
    189
  • Abstract
    In this paper, we study discriminative training of acoustic models for speech recognition under two criteria: maximum mutual information (MMI) and a novel "error-weighted" training technique. We present a proof that the standard MMI training technique is valid for a very general class of acoustic models with any kind of parameter tying. We report experimental results for subspace constrained Gaussian mixture models (SCGMMs), where the exponential model weights of all Gaussians are required to belong to a common "tied" subspace, as well as for subspace precision and mean (SPAM) models which impose separate subspace constraints on the precision matrices (i.e., inverse covariance matrices) and means. It has been shown previously that SCGMMs and SPAM models generalize and yield significant error rate improvements over previously considered model classes such as diagonal models, models with semitied covariances, and extended maximum likelihood linear transformation (EMLLT) models. We show here that MMI and error-weighted training each individually result in over 20% relative reduction in word error rate on a digit task over maximum-likelihood (ML) training. We also show that a gain of as much as 28% relative can be achieved by combining these two discriminative estimation techniques.
  • Keywords
    Gaussian processes; covariance matrices; maximum likelihood estimation; speech recognition; acoustic models; error-weighted training technique; extended maximum likelihood linear transformation; inverse covariance matrices; maximum mutual information; subspace constrained Gaussian mixture models; subspace precision and mean models; Covariance matrix; Error analysis; Hidden Markov models; Mutual information; State estimation; Subspace constraints; Training data; Unsolicited electronic mail; Covariance modeling; discriminative estimation of Gaussian mixture models (GMMs); exponential distributions; maximum-likelihood (ML) estimation; subspace constrained Gaussian mixture models (SCGMMs); subspace constrained exponential models
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2006.872617
  • Filename
    4032762