DocumentCode :
2838749
Title :
A superposed prosodic model for Chinese text-to-speech synthesis
Author :
Chen, Gao-Peng ; Bailly, G. ; Liu, Qing-Feng ; Wang, Ren-Hua
fYear :
2004
fDate :
15-18 Dec. 2004
Firstpage :
177
Lastpage :
180
Abstract :
This paper applies the trainable SFC superpositional prosodic model to Chinese. Within the SFC model, prosodic parameters (F0, syllabic lengthening) are interpreted as the superposition of overlapping multiparametric contours. These contours are associated with high-level prosodic features operating at different scopes, such as tones, stress, prosodic boundaries, and the part of speech of words. Each feature label corresponds to a metalinguistic function (morphological, lexical, syntactic, attitudinal, etc.), which is represented by a neural network. The observed contour is the sum of the outputs of the corresponding neural networks. An analysis-by-synthesis scheme is implemented for automatic learning. The model handles the concatenation of neighboring units well. The RMSE of F0 prediction is 2.34 semitones (st, referenced to 200 Hz), with a correlation of 0.86. Perceptual experiments show that the predicted prosody is quite appropriate and fluent.
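The two quantitative ideas in the abstract, summing overlapping contours and reporting F0 error in semitones relative to 200 Hz, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `hz_to_semitones`, `superpose`, and `rmse` are hypothetical, and the contours are hand-written number lists standing in for the outputs of the neural networks described above.

```python
import math

REF_HZ = 200.0  # reference frequency stated in the abstract


def hz_to_semitones(f0_hz, ref_hz=REF_HZ):
    """Convert an F0 value in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def superpose(n_syllables, contours):
    """Sum overlapping contours into one contour of length n_syllables.

    Each contour is a (start, values) pair covering its own scope.
    The values here are illustrative stand-ins, not network outputs.
    """
    total = [0.0] * n_syllables
    for start, values in contours:
        for offset, value in enumerate(values):
            total[start + offset] += value
    return total


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# A wide-scope contour over four syllables plus a narrower one over
# syllables 1 and 2 (hypothetical values, in semitones).
contours = [(0, [1.0, 0.5, 0.0, -0.5]), (1, [2.0, -2.0])]
predicted = superpose(4, contours)
```

With these inputs, `predicted` is `[1.0, 2.5, -2.0, -0.5]`, and `hz_to_semitones(400.0)` is `12.0`, one octave above the 200 Hz reference.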
Keywords :
learning (artificial intelligence); linguistics; natural language interfaces; neural nets; speech synthesis; text analysis; Chinese text-to-speech synthesis; analysis-by-synthesis; correlation; learning; metalinguistic function; neural network; overlapping multiparametric contour superposition; part of speech; prosodic boundary; prosodic parameters; stress; superposed prosodic model; syllabic lengthening; tones; Discrete event simulation; Encoding; Natural languages; Neural networks; Proposals; Prototypes; Shape; Speech synthesis; Stress;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Chinese Spoken Language Processing, 2004 International Symposium on
Print_ISBN :
0-7803-8678-7
Type :
conf
DOI :
10.1109/CHINSL.2004.1409615
Filename :
1409615