Title :
All-phoneme ergodic hidden Markov network for unsupervised speaker adaptation
Author :
Miyazawa, Yasunaga ; Takami, Jun-Ichi ; Sagayama, Shigeki ; Matsunaga, Shoichi
Author_Institution :
ATR Interpreting Telephony Res. Labs., Kyoto, Japan
Abstract :
The paper proposes an unsupervised speaker adaptation method using an “all-phoneme ergodic hidden Markov network” that combines allophonic (context-dependent phone) acoustic models with stochastic language constraints. Hidden Markov networks (HMnets) for allophone modeling are combined with allophonic bigram probabilities derived from a large text database to yield a single large ergodic HMM that can represent arbitrary speech signals in a particular language. Because the model covers any utterance, its parameters can be re-estimated from text-unknown speech samples with the Baum-Welch algorithm. Combined with the vector field smoothing (VFS) technique, this allows unsupervised speaker adaptation to be performed effectively. In experiments, the method performed better than the authors' previous unsupervised adaptation method, which used conventional phonetic HMMs and phoneme bigram probabilities.
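The construction described in the abstract, joining per-unit acoustic models through bigram probabilities into one ergodic HMM, can be illustrated with a toy sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: each unit is a plain left-to-right chain of states with self-loops (not an HMnet, which shares states among allophones), the function name `build_ergodic_transitions` is hypothetical, and the numbers are made up. Only the transition structure is shown; Baum-Welch re-estimation and VFS are not.

```python
def build_ergodic_transitions(self_loops, bigram):
    """Build the transition matrix of one large ergodic HMM (toy sketch).

    self_loops[u] -- self-loop probabilities of unit u's states, in
                     left-to-right order (assumed topology, not an HMnet).
    bigram[u][v]  -- P(next unit = v | current unit = u); rows sum to 1.
    """
    # Global index of the first state of each unit.
    offsets = []
    total = 0
    for loops in self_loops:
        offsets.append(total)
        total += len(loops)

    A = [[0.0] * total for _ in range(total)]
    for u, loops in enumerate(self_loops):
        for k, stay in enumerate(loops):
            s = offsets[u] + k
            A[s][s] += stay
            leave = 1.0 - stay
            if k + 1 < len(loops):
                # Inside a unit: move on to the unit's next state.
                A[s][s + 1] += leave
            else:
                # Last state of a unit: enter the first state of any unit,
                # weighted by the bigram probability. Every unit can follow
                # every other, which is what makes the model ergodic.
                for v, p in enumerate(bigram[u]):
                    A[s][offsets[v]] += leave * p
    return A


# Made-up example: unit 0 has two states, unit 1 has one state.
self_loops = [[0.6, 0.5], [0.7]]
bigram = [[0.2, 0.8],
          [0.9, 0.1]]
A = build_ergodic_transitions(self_loops, bigram)
for row in A:
    print([round(x, 3) for x in row])
```

Each row of the resulting matrix sums to 1 as long as each bigram row does, so the combined network is itself a valid HMM and could, in principle, be re-estimated on untranscribed speech.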
Keywords :
hidden Markov models; neural nets; smoothing methods; speech recognition; unsupervised learning; Baum-Welch algorithm; Japanese phrases; all-phoneme ergodic hidden Markov network; allophonic acoustic models; bigram probabilities; context-dependent phone acoustic models; language; speech signals; stochastic language constraints; text database; text-unknown speech samples; unsupervised speaker adaptation; vector field smoothing; Context modeling; Feedback; Hidden Markov models; Loudspeakers; Markov random fields; Natural languages; Speech analysis; Stochastic processes; Stochastic systems; Training data
Conference_Title :
Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
Conference_Location :
Adelaide, SA
Print_ISBN :
0-7803-1775-0
DOI :
10.1109/ICASSP.1994.389308