  • DocumentCode
    3529895
  • Title
    Refactoring acoustic models using variational density approximation
  • Author
    Dognin, Pierre L.; Hershey, John R.; Goel, Vaibhava; Olsen, Peder A.
  • Author_Institution
    IBM T. J. Watson Research Center, Yorktown Heights, NY
  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    4473
  • Lastpage
    4476
  • Abstract
    In model-based pattern recognition it is often useful to change the structure, or refactor, a model. For example, we may wish to find a Gaussian mixture model (GMM) with fewer components that best approximates a reference model. One application for this arises in speech recognition, where a variety of model size requirements exists for different platforms. Since the target size may not be known a priori, one strategy is to train a complex model and subsequently derive models of lower complexity. We present methods for reducing model size without training data, following two strategies: GMM-approximation and Gaussian clustering based on divergences. A variational expectation-maximization algorithm is derived that unifies these two approaches. The resulting algorithms reduce the model size by 50% with less than 4% increase in error rate relative to the same-sized model trained on data. In fact, for up to 35% reduction in size, the algorithms can improve accuracy relative to baseline.
  • Keywords
    Gaussian processes; acoustic signal processing; approximation theory; expectation-maximisation algorithm; pattern recognition; GMM-approximation; Gaussian clustering; model-based pattern recognition; speech recognition; variational density approximation; variational expectation-maximization algorithm; Acoustic applications; Automatic speech recognition; Clustering algorithms; Context modeling; Error analysis; Merging; Pattern recognition; Probability density function; Speech recognition; Training data; Acoustic model clustering; Bhattacharyya divergence; KL divergence; variational approximations
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2009.4960623
  • Filename
    4960623