Title :
An optimized neural network based prosody model of Chinese speech synthesis system
Author :
Tao, Jianhua ; Cai, Lianhong ; Tropf, Herbert
Author_Institution :
Dept. of Comput. Sci., Tsinghua Univ., Beijing, China
Abstract :
Generating a high-quality pitch contour is an important problem for any text-to-speech (TTS) system, and the naturalness of generated pitch contours is still far from satisfactory. This paper describes a trainable prosody model, based on a neural network, for a Mandarin TTS system. Extensive tests show that the structure of the neural network characterizes Mandarin prosody more accurately than traditional models. The naturalness of the result is considerably improved, and the system performs more flexibly in practice. Furthermore, personal and task-specific characteristics are maintained. The paper adopts a fuzzy clustering algorithm to classify the pitch contours of Mandarin syllables. The algorithm proved very useful for optimizing the neural network and making it suitable for handling Mandarin pitch contours.
Keywords :
fuzzy neural nets; learning (artificial intelligence); optimisation; pattern classification; pattern clustering; speech processing; speech synthesis; Chinese speech synthesis system; Mandarin TTS system; Mandarin syllables; fuzzy clustering algorithm; naturalness; optimized neural network; pitch contour classification; pitch contour generation; prosody model; trainable prosody model; Clustering algorithms; Computer science; Databases; Fuzzy neural networks; Fuzzy systems; Natural languages; Neural networks; Speech processing; Speech synthesis; Testing;
Conference_Title :
TENCON '02. Proceedings. 2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering
Print_ISBN :
0-7803-7490-8
DOI :
10.1109/TENCON.2002.1181317