Title :
An all-phoneme ergodic HMM for unsupervised speaker adaptation
Author :
Miyazawa, Yasunaga
Author_Institution :
ATR Interpreting Telephony Res. Lab., Soraku-gun, Kyoto, Japan
Abstract :
The author proposes an all-phoneme ergodic HMM (hidden Markov model) that incorporates stochastic language constraints in unsupervised speaker adaptation. The proposed model consists of all-phoneme HMMs and interphoneme probabilities. It can be regarded as a rather large single ergodic HMM containing hidden states of all phonemes as well as intraphoneme and interphoneme transition probabilities. Since this model is a model of arbitrarily spoken words, the standard Baum-Welch reestimation algorithm can be used to train the whole ergodic model. In the experiments, only mean vectors of the state output probability densities are reestimated, and a vector field smoothing algorithm is used to enhance the statistical reliability. The proposed method was tested on phoneme and phrase recognition experiments with male reference and input speakers. A better performance than with the speaker-independent case was attained by using adaptation data shorter than three minutes.
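As a rough illustration of the construction described in the abstract, the sketch below joins per-phoneme HMMs into one large transition matrix using interphoneme probabilities, and shows the standard Baum-Welch mean update applied to means only. It is not the author's implementation. The function names, the convention that the probability mass missing from a phoneme's last state row is its exit probability, and the assumption that every phoneme is entered at its first state are all hypothetical choices made for this example. The vector field smoothing step is not shown.

```python
import numpy as np


def build_ergodic_transitions(intra, inter):
    """Combine per-phoneme HMMs into one ergodic transition matrix.

    intra: list of square intraphoneme transition matrices, one per phoneme.
           Assumption: every row sums to 1 except the last, whose missing
           mass is the probability of leaving the phoneme.
    inter: (P, P) interphoneme probabilities; inter[p, q] is the probability
           that phoneme q follows phoneme p (each row sums to 1).
    """
    sizes = [a.shape[0] for a in intra]
    offsets = np.concatenate(([0], np.cumsum(sizes)[:-1]))
    total = int(sum(sizes))
    A = np.zeros((total, total))
    for p, a in enumerate(intra):
        start = int(offsets[p])
        # Intraphoneme transitions form a block on the diagonal.
        A[start:start + sizes[p], start:start + sizes[p]] = a
        # The exit mass of the last state is spread over the first
        # states of all phonemes according to the interphoneme row.
        last = start + sizes[p] - 1
        exit_prob = 1.0 - a[-1].sum()
        for q in range(len(intra)):
            A[last, int(offsets[q])] += exit_prob * inter[p, q]
    return A


def reestimate_means(gamma, obs, eps=1e-12):
    """Baum-Welch update of state mean vectors only.

    gamma: (T, N) state occupancy probabilities from forward-backward.
    obs:   (T, D) observation vectors.
    Returns an (N, D) array of occupancy-weighted means.
    """
    occupancy = np.maximum(gamma.sum(axis=0), eps)  # guard unvisited states
    return (gamma.T @ obs) / occupancy[:, None]
```

Because the combined matrix is an ordinary HMM transition matrix, the forward-backward pass that produces `gamma` can run over unlabeled speech without a phoneme transcription, which is what makes the adaptation unsupervised.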
Keywords :
adaptive systems; constraint handling; hidden Markov models; reliability; speech recognition; stochastic systems; unsupervised learning; Baum-Welch reestimation algorithm; all-phoneme ergodic HMM; hidden Markov model; interphoneme probabilities; performance; phoneme recognition; phrase recognition; state output probability densities; statistical reliability; stochastic language constraints; unsupervised speaker adaptation; vector field smoothing algorithm
Conference_Title :
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location :
Minneapolis, MN, USA
Print_ISBN :
0-7803-7402-9
DOI :
10.1109/ICASSP.1993.319372