Title :
Multi-level prosody and spectrum conversion for emotional speech synthesis
Author :
Zexun Wang ; Yibiao Yu
Author_Institution :
Sch. of Electron. & Inf. Eng., Soochow Univ., Suzhou, China
Abstract :
Emotional speech can be synthesized by converting the prosodic and spectrum features of neutral speech. This paper proposes a multi-level prosody conversion method that converts three prosodic features (F0, short-time energy, and speaking rate) sequentially at the syllable, prosodic-word, and sentence levels. F0 and speaking rate are modeled by Gaussian distributions, and energy is modeled by a Gamma distribution. Both objective and subjective evaluation tests show that the proposed method is effective for emotion conversion of speech.
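The abstract does not state the conversion rule itself, so the following is only a minimal sketch of how such statistical modeling is commonly set up, not the authors' method. It assumes a z-score (mean and variance matching) mapping between a source and a target Gaussian, and a method-of-moments fit for the Gamma parameters. All function names and the sample F0 values are hypothetical and chosen for illustration.

```python
from statistics import mean, pstdev, pvariance


def fit_gaussian(values):
    """Return (mean, standard deviation) of a feature sample."""
    return mean(values), pstdev(values)


def gaussian_convert(x, src, tgt):
    """Map x from the source Gaussian to the target Gaussian.

    Assumption: plain mean/variance matching, which keeps the
    z-score of x unchanged. The paper may use a different rule.
    """
    (mu_s, sd_s), (mu_t, sd_t) = src, tgt
    return mu_t + (x - mu_s) * sd_t / sd_s


def fit_gamma_moments(values):
    """Method-of-moments Gamma fit: returns (shape, scale)."""
    m, v = mean(values), pvariance(values)
    return m * m / v, v / m


# Hypothetical syllable-level mean F0 values in Hz (not from the paper).
neutral_f0 = [180.0, 200.0, 220.0]
emotional_f0 = [240.0, 280.0, 320.0]

src = fit_gaussian(neutral_f0)
tgt = fit_gaussian(emotional_f0)
converted = [gaussian_convert(x, src, tgt) for x in neutral_f0]
print(converted)
```

In a multi-level setup, the same mapping could in principle be applied in turn with statistics gathered per syllable, per prosodic word, and per sentence; how the levels are actually combined is not specified in the abstract.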
Keywords :
emotion recognition; feature extraction; gamma distribution; speech synthesis; emotion conversion; emotional speech synthesis; energy model; multilevel prosody conversion method; objective evaluation test; prosodic feature; prosodic word; sentence level; spectrum conversion; spectrum feature; subjective evaluation test; syllable; Analytical models; Educational institutions; Media; Speech; Statistical analysis; Vectors; Multi-level prosody
Conference_Title :
Signal Processing (ICSP), 2014 12th International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4799-2188-1
DOI :
10.1109/ICOSP.2014.7015072