Regional Information Center for Science and Technology (RICeST) - Bayesian duration modeling and learning for speech recognition

DocumentCode :

3330080

Title :

Bayesian duration modeling and learning for speech recognition

Author :

Chien, Jen-Tzung ; Huang, Chih-Hsien

Author_Institution :

Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan

Volume :

fYear :

2004

fDate :

17-21 May 2004

Abstract :

We present Bayesian duration modeling and learning for speech recognition under nonstationary speaking rates and noise conditions. Gaussian, Poisson and gamma distributions are investigated for characterizing duration models. The maximum a posteriori (MAP) estimate of the gamma duration model is developed. To exploit sequential learning, we adopt the Poisson duration model with a gamma prior density, which belongs to the conjugate prior family. When the adaptation data are observed sequentially, the resulting gamma posterior density offers two advantages. The first is to determine the optimal quasi-Bayes (QB) duration parameter, which can be merged into HMMs for speech recognition. The second is to build the updating mechanism of the gamma prior statistics for sequential learning. An expectation-maximization algorithm is applied for parameter estimation. In the experiments, the proposed Bayesian approaches significantly improve speech recognition performance on Mandarin broadcast news. Batch and sequential learning are investigated for the MAP and QB duration models, respectively.
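
The conjugate relationship mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes state durations are observed directly (the paper applies an expectation-maximization algorithm, which is omitted here), it uses the posterior mode as a stand-in for the point estimate of the duration parameter, and the class name `GammaPrior`, its fields, and the sample durations are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class GammaPrior:
    """Gamma(shape, rate) density over a Poisson rate (hypothetical helper)."""
    shape: float  # often written alpha
    rate: float   # often written beta

    def update(self, durations: Iterable[int]) -> "GammaPrior":
        """Return the gamma posterior after observing Poisson-distributed durations.

        Standard conjugate update: shape grows by the sum of the observations,
        rate grows by the number of observations.
        """
        observed = list(durations)
        return GammaPrior(self.shape + sum(observed), self.rate + len(observed))

    def mode(self) -> float:
        """Posterior mode of the Poisson rate (0 when shape <= 1)."""
        return max(self.shape - 1.0, 0.0) / self.rate


# Assumed prior statistics and made-up duration counts (in frames).
prior = GammaPrior(shape=2.0, rate=1.0)
batches = [[3, 4, 5], [6, 7]]

# Sequential learning: each posterior becomes the prior for the next batch.
posterior = prior
for batch in batches:
    posterior = posterior.update(batch)

# Batch learning over the same data gives identical statistics.
batch_posterior = prior.update([d for batch in batches for d in batch])

print(posterior, posterior.mode())
```

Because the posterior stays in the gamma family, only two statistics need to be carried from one batch of adaptation data to the next, which is what makes the sequential form convenient.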

Keywords :

Bayes methods; Gaussian distribution; Poisson distribution; gamma distribution; hidden Markov models; parameter estimation; speech recognition; Bayesian duration modeling; Bayesian learning; HMM; MAP estimate; conjugate prior family; expectation-maximization algorithm; gamma prior density; gamma prior statistics; maximum a posteriori estimate; nonstationary noise conditions; nonstationary speaking rates; optimal quasi-Bayes duration parameter; sequential learning; Automatic speech recognition; Bayesian methods; Computer science; Maximum likelihood estimation; Statistics; Training data; Viterbi algorithm

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on

ISSN :

1520-6149

Print_ISBN :

0-7803-8484-9

Type :

conf

DOI :

10.1109/ICASSP.2004.1326158

Filename :

1326158

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3330080