Speech Synthesis Based on Gaussian Conditional Random Fields

Author/Authors

Soheil Khorram (Department of Computer Engineering, Sharif University of Technology, Tehran, Iran); Fahimeh Bahmaninezhad (Department of Computer Engineering, Sharif University of Technology, Tehran, Iran); Hossein Sameti (Department of Computer Engineering, Sharif University of Technology, Tehran, Iran)

Keywords

HSMM extension, statistical parametric speech synthesis, Gaussian conditional random field

Publication year

1392 (Solar Hijri)

Conference title

International Symposium on Artificial Intelligence and Signal Processing

Document language

Latin

Abstract (Latin)

Hidden Markov Model (HMM)-based speech synthesis (HTS) has recently been confirmed to be the most effective method for generating natural speech. However, it lacks adequate context generalization when the training data is limited. As a solution, the current study provides a new context-dependent speech modeling framework based on Gaussian Conditional Random Field (GCRF) theory. Using this model, a speech synthesis system has been developed that can be viewed as an extension of the Context-Dependent Hidden Semi-Markov Model (CD-HSMM). A novel Viterbi decoder and a stochastic gradient ascent algorithm were applied to train the model parameters, and a fast and efficient parameter generation algorithm was derived for the synthesis part. Experimental results using objective and subjective criteria show that the proposed system substantially outperforms HSMM when the speech database is limited. Moreover, the Mel-cepstral distance of the spectral parameters is reduced considerably for every training database size.

Country

Iran

Number of pages: 2

From page

To page

Link to this document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=36&DC=276786