DocumentCode :
3163128
Title :
Applications of Dirichlet Process Mixtures to speaker adaptation
Author :
Torbati, Amir Hossein Harati Nejad ; Picone, Joe ; Sobel, Marc
Author_Institution :
Dept. of Electr. & Comp. Eng., Temple Univ., Philadelphia, PA, USA
fYear :
2012
fDate :
25-30 March 2012
Firstpage :
4321
Lastpage :
4324
Abstract :
Balancing the unique acoustic characteristics of a speaker, such as identity and accent, with the general acoustic behavior that describes phoneme identity is one of the great challenges in applying nonparametric Bayesian approaches to speaker adaptation. The Dirichlet Process Mixture (DPM) is a relatively new model that provides an elegant framework in which individual characteristics can be balanced with aggregate behavior without diluting the quality of the individual models. Unlike Gaussian mixture models (GMMs), which tend to smear multimodal behavior through averaging, the DPM attempts to preserve unique behaviors through the use of an infinite mixture model. In this paper, we present exploratory research on applying these models to the acoustic modeling component of the speaker adaptation problem. DPM-based models are shown to provide up to a 10% reduction in word error rate (WER) over maximum likelihood linear regression (MLLR) on a speaker adaptation task based on the Resource Management database.
Keywords :
Bayes methods; maximum likelihood estimation; regression analysis; speaker recognition; Dirichlet process mixtures; aggregate behavior; elegant framework; general acoustic behavior; infinite mixture; maximum likelihood linear regression; nonparametric Bayesian approach; phoneme identity; resource management database; smear multimodal behavior; speaker adaptation; unique acoustic characteristics; Adaptation models; Clustering algorithms; Computational modeling; Hidden Markov models; Inference algorithms; Mathematical model; Regression tree analysis; Dirichlet Process Mixture; nonparametric Bayesian models; speaker adaptation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1520-6149
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2012.6288875
Filename :
6288875