DocumentCode
134220
Title
Integrating global variance of log power spectrum derived from LSPs into MGE training for HMM-based parametric speech synthesis
Author
Yu-Sheng Sun ; Zhen-Hua Ling ; Xiang Yin ; Li-Rong Dai
Author_Institution
Nat. Eng. Lab. for Speech & Language Inf. Process., Univ. of Sci. & Technol. of China, Hefei, China
fYear
2014
fDate
12-14 Sept. 2014
Firstpage
201
Lastpage
205
Abstract
This paper presents a method to improve hidden Markov model (HMM)-based parametric speech synthesis by integrating the global variance (GV) of the log power spectrum (LPS) derived from line spectral pairs (LSPs) into minimum generation error (MGE) model training. To alleviate the over-smoothing of generated spectral structures, an LPS-GV based parameter generation method was previously proposed. That method improved the naturalness of synthetic speech when LSPs were used as spectral features, but it significantly increased the complexity of parameter generation at synthesis time. In this paper, we propose integrating the distortion of the LPS-GV derived from LSPs into the MGE training criterion, so that LPS-GV information is used at training time instead of at synthesis time. Experimental results show that the proposed method achieves better naturalness of synthetic speech than conventional MGE model training, without loss of efficiency at synthesis time, when LSPs are used as spectral features.
Keywords
computational complexity; hidden Markov models; speech synthesis; HMM-based parametric speech synthesis; LPS-GV based parameter generation method; LSP; MGE training; global variance; hidden Markov model; line spectral pairs; log power spectrum; minimum generation error model training; parameter generation complexity; spectral features; synthesis time; synthetic speech; Acoustic distortion; Acoustics; Hidden Markov models; Speech; Speech synthesis; Training; Vectors
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location
Singapore
Type
conf
DOI
10.1109/ISCSLP.2014.6936612
Filename
6936612