Regional Information Center for Science and Technology - Applications of Dirichlet Process Mixtures to speaker adaptation

DocumentCode :

3163128

Title :

Applications of Dirichlet Process Mixtures to speaker adaptation

Author :

Torbati, Amir Hossein Harati Nejad ; Picone, Joe ; Sobel, Marc

Author_Institution :

Dept. of Electr. & Comp. Eng., Temple Univ., Philadelphia, PA, USA

fYear :

2012

fDate :

25-30 March 2012

Firstpage :

4321

Lastpage :

4324

Abstract :

Balancing the unique acoustic characteristics of a speaker, such as identity and accent, with the general acoustic behavior that describes phoneme identity is one of the great challenges in applying nonparametric Bayesian approaches to speaker adaptation. The Dirichlet Process Mixture (DPM) is a relatively new model that provides an elegant framework in which individual characteristics can be balanced with aggregate behavior without diluting the quality of the individual models. Unlike Gaussian mixture models (GMMs), which tend to smear multimodal behavior through averaging, the DPM attempts to preserve unique behaviors through the use of an infinite mixture model. In this paper, we present exploratory research on applying these models to the acoustic modeling component of the speaker adaptation problem. DPM-based models are shown to provide up to a 10% reduction in word error rate (WER) over maximum likelihood linear regression (MLLR) on a speaker adaptation task based on the Resource Management database.

Keywords :

Bayes methods; maximum likelihood estimation; regression analysis; speaker recognition; Dirichlet process mixtures; aggregate behavior; elegant framework; general acoustic behavior; infinite mixture; maximum likelihood linear regression; nonparametric Bayesian approach; phoneme identity; resource management database; smear multimodal behavior; speaker adaptation; unique acoustic characteristics; Adaptation models; Clustering algorithms; Computational modeling; Hidden Markov models; Inference algorithms; Mathematical model; Regression tree analysis; Dirichlet Process Mixture; nonparametric Bayesian models; speaker adaptation;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location :

Kyoto

ISSN :

1520-6149

Print_ISBN :

978-1-4673-0045-2

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2012.6288875

Filename :

6288875

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3163128