DocumentCode
2016709
Title
Minimum generation error training for HMM-based prediction of articulatory movements
Author
Zhao, Tian-Yi; Ling, Zhen-Hua; Lei, Ming; Dai, Li-Rong; Liu, Qing-Feng
Author_Institution
iFLYTEK Speech Lab., Univ. of Sci. & Technol. of China, Hefei, China
fYear
2010
fDate
Nov. 29 - Dec. 3, 2010
Firstpage
99
Lastpage
102
Abstract
This paper presents a minimum generation error (MGE) training method for hidden Markov model (HMM) based prediction of articulatory movements when both text and audio inputs are given. In this method, the MGE criterion replaces the maximum likelihood (ML) criterion when estimating the model parameters of the unified acoustic-articulatory HMMs. Unlike MGE training for HMM-based acoustic speech synthesis, the generation error used here is defined as the distance between the generated and the natural articulatory features. Experimental results show that our proposed method significantly improves the accuracy of articulatory movement prediction, reducing the average root mean square (RMS) error on the test set from 1.002 mm to 0.913 mm.
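The evaluation metric named in the abstract, the average RMS error between generated and natural articulatory trajectories, can be sketched as follows. This is a hypothetical illustration with made-up coordinate values, not data from the paper; the trajectory layout (frames x articulatory channels, in millimetres) is an assumption.

```python
import math

# Made-up stand-ins for a natural (measured) and a generated (predicted)
# articulatory trajectory: 3 frames of 2 channels each, in millimetres.
natural = [(1.0, 2.0), (1.5, 2.5), (2.0, 3.0)]
generated = [(1.1, 1.9), (1.4, 2.7), (2.2, 2.9)]

# Squared errors pooled over all frames and channels.
sq_errs = [(g - n) ** 2
           for gen_frame, nat_frame in zip(generated, natural)
           for g, n in zip(gen_frame, nat_frame)]

# RMS error in millimetres: sqrt of the mean squared error.
rms_mm = math.sqrt(sum(sq_errs) / len(sq_errs))
print(round(rms_mm, 4))  # 0.1414
```

The paper's MGE training minimizes this kind of distance between generated and natural features directly, rather than maximizing likelihood.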
Keywords
hidden Markov models; mean square error methods; speech synthesis; HMM based prediction; MGE; ML; RMS; acoustic speech synthesis; articulatory movements; hidden Markov model; maximum likelihood; minimum generation error training; root mean square; Acoustics; Covariance matrix; Predictive models; Training; Transforms; articulatory features
fLanguage
English
Publisher
ieee
Conference_Title
2010 7th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Conference_Location
Tainan
Print_ISBN
978-1-4244-6244-5
Type
conf
DOI
10.1109/ISCSLP.2010.5684840
Filename
5684840
Link To Document