• DocumentCode
    3163128
  • Title
    Applications of Dirichlet Process Mixtures to speaker adaptation
  • Author
    Torbati, Amir Hossein Harati Nejad ; Picone, Joe ; Sobel, Marc
  • Author_Institution
    Dept. of Electr. & Comp. Eng., Temple Univ., Philadelphia, PA, USA
  • fYear
    2012
  • fDate
    25-30 March 2012
  • Firstpage
    4321
  • Lastpage
    4324
  • Abstract
    Balancing the unique acoustic characteristics of a speaker, such as identity and accent, with the general acoustic behavior that describes phoneme identity is one of the great challenges in applying nonparametric Bayesian approaches to speaker adaptation. The Dirichlet Process Mixture (DPM) is a relatively new model that provides an elegant framework in which individual characteristics can be balanced with aggregate behavior without diluting the quality of the individual models. Unlike Gaussian mixture models (GMMs), which tend to smear multimodal behavior through averaging, the DPM attempts to preserve unique behaviors through the use of an infinite mixture model. In this paper, we present exploratory research on applying these models to the acoustic modeling component of the speaker adaptation problem. DPM-based models are shown to provide up to a 10% reduction in word error rate (WER) over maximum likelihood linear regression (MLLR) on a speaker adaptation task based on the Resource Management database.
  • Keywords
    Bayes methods; maximum likelihood estimation; regression analysis; speaker recognition; Dirichlet process mixtures; aggregate behavior; elegant framework; general acoustic behavior; infinite mixture; maximum likelihood linear regression; nonparametric Bayesian approach; phoneme identity; resource management database; smear multimodal behavior; speaker adaptation; unique acoustic characteristics; Adaptation models; Clustering algorithms; Computational modeling; Hidden Markov models; Inference algorithms; Mathematical model; Regression tree analysis; Dirichlet Process Mixture; nonparametric Bayesian models; speaker adaptation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4673-0045-2
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2012.6288875
  • Filename
    6288875