DocumentCode
2019751
Title
A superpositional model applied to F0 parameterization using DCT for text-to-speech synthesis
Author
Stan, Adriana ; Giurgiu, Mircea
Author_Institution
Commun. Dept., Tech. Univ. of Cluj-Napoca, Cluj-Napoca, Romania
fYear
2011
fDate
18-21 May 2011
Firstpage
1
Lastpage
6
Abstract
This paper presents a superpositional model of F0 contours based on a DCT (Discrete Cosine Transform) parameterization. We examine the capacity of the DCT coefficients to capture both the fast variations of the F0 contour at syllable level and its overall trend at phrase level. The syllable-level coefficients are computed after subtracting the estimated phrase-level contour from the original one, so the syllable is treated as having an additive prosodic effect on top of the phrase level. We also compare three decision and regression tree algorithms for clustering and predicting the DCT coefficients. Additional features are selected with a greedy stepwise feature selection method without backtracking. The results support the proposed method: the average squared errors are low, and the synthesized speech contains few or no perceivable errors.
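The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the number of retained coefficients (n_phrase, n_syllable), the use of an orthonormal DCT-II, and the frame-index syllable boundaries are all assumptions made for illustration; the abstract does not specify them. The tree-based prediction of the coefficients is not covered here.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II basis of shape (n, n); row k is the k-th cosine."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.cos(np.pi * (2 * t + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    basis[0, :] /= np.sqrt(2.0)
    return basis

def dct_truncate(signal, n_coeffs):
    """Return the first n_coeffs DCT coefficients and the contour they rebuild."""
    basis = dct_basis(len(signal))
    kept = (basis @ signal)[:n_coeffs]
    approx = basis[:n_coeffs].T @ kept
    return kept, approx

def superpositional_dct(f0, syllable_bounds, n_phrase=3, n_syllable=5):
    """Hypothetical helper: phrase-level DCT plus syllable-level DCT of the residual.

    n_phrase and n_syllable are assumed values, not taken from the paper.
    syllable_bounds is a list of (start, end) frame indices, end exclusive.
    """
    f0 = np.asarray(f0, dtype=float)
    # Phrase level: a few low-order coefficients capture the overall trend.
    phrase_coeffs, phrase_contour = dct_truncate(f0, n_phrase)
    # Syllable level: parameterize what remains after subtracting the phrase contour.
    residual = f0 - phrase_contour
    syllable_coeffs = []
    rebuilt = phrase_contour.copy()
    for start, end in syllable_bounds:
        n = min(n_syllable, end - start)
        coeffs, approx = dct_truncate(residual[start:end], n)
        syllable_coeffs.append(coeffs)
        rebuilt[start:end] += approx  # additive superposition
    return phrase_coeffs, syllable_coeffs, rebuilt

# Synthetic contour in Hz: a declining phrase trend plus faster local movement.
frames = np.arange(40)
f0_demo = 180.0 - 0.8 * frames + 12.0 * np.sin(frames / 2.5)
bounds_demo = [(0, 10), (10, 25), (25, 40)]
phrase_c, syll_c, f0_rebuilt = superpositional_dct(f0_demo, bounds_demo)
mse_demo = float(np.mean((f0_demo - f0_rebuilt) ** 2))
```

Because each syllable segment is projected onto an orthonormal basis, adding the syllable-level approximation can only lower the squared error of the phrase-only contour, or leave it unchanged.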
Keywords
decision trees; discrete cosine transforms; regression analysis; speech synthesis; DCT; decision tree; discrete cosine transform; regression tree algorithm; superpositional model; text to speech synthesis; Decision trees; Discrete cosine transforms; Feature extraction; Prediction algorithms; Speech; Stress; Training; DCT; F0 modelling; pitch; prosody;
fLanguage
English
Publisher
IEEE
Conference_Titel
Speech Technology and Human-Computer Dialogue (SpeD), 2011 6th Conference on
Conference_Location
Brasov
Print_ISBN
978-1-4577-0440-6
Type
conf
DOI
10.1109/SPED.2011.5940734
Filename
5940734