DocumentCode :
417230
Title :
A real-time Cantonese text-to-audiovisual speech synthesizer
Author :
Wang, Jian-Qing ; Wong, Ka-Ho ; Heng, Pheng-Ann ; Meng, Helen M. ; Wong, Tien-Tsin
Author_Institution :
Dept. of Comput. Sci. & Eng., Chinese Univ. of Hong Kong, China
Volume :
1
fYear :
2004
fDate :
17-21 May 2004
Abstract :
This paper describes the design and development of a Cantonese text-to-audiovisual speech (TTVS) synthesizer that generates highly natural synthetic speech precisely time-synchronized with real-time 3D face rendering. The synthesizer builds on CU VOCAL, a homegrown Cantonese syllable-based concatenative text-to-speech system, which we extend to output syllable labels and durations that correspond to the output acoustic wave file. The syllables are decomposed, and their initials and finals are mapped to the nearest IPA symbols, which in turn correspond to static viseme models. We have authored sixteen static viseme models together with two emotion-based face models. To render the 3D face, we have designed and implemented a blending technique that computes linear combinations of the static face models to effect smooth transitions between models. We demonstrate that this design and implementation of a TTVS synthesizer can achieve real-time performance in generation.
Keywords :
real-time systems; rendering (computer graphics); speech synthesis; synchronisation; CU VOCAL; Cantonese text-to-audiovisual speech; IPA symbols; TTVS synthesizer; acoustic wave file; blending technique; concatenative text-to-speech system; durations; emotion-based face models; highly natural synthetic speech; real-time 3D face rendering; real-time speech synthesizer; static viseme models; syllable labels; text-to-audiovisual speech synthesizer; time-synchronization; Design engineering; Facial animation; Financial advantage program; Head; Hidden Markov models; Real time systems; Speech synthesis; Synthesizers; Virtual reality; Visualization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8484-9
Type :
conf
DOI :
10.1109/ICASSP.2004.1326070
Filename :
1326070
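Illustrative note on the abstract: the blending step it mentions, linear combinations of static face models timed by syllable durations, can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the viseme names, the two-coordinate "meshes", the segment durations, and the rule that each segment moves linearly from its own viseme toward the next one over its duration are all hypothetical choices made for illustration. The record does not specify the actual interpolation schedule.

```python
from bisect import bisect_right
from itertools import accumulate


def blend(face_a, face_b, weight):
    """Linear combination (1 - weight) * face_a + weight * face_b per coordinate."""
    return [(1.0 - weight) * a + weight * b for a, b in zip(face_a, face_b)]


def face_at(time_s, segments, visemes):
    """Return blended coordinates at time_s seconds into the utterance.

    segments: list of (viseme_name, duration_seconds), durations > 0 (assumed).
    visemes:  dict mapping viseme_name to a flat list of vertex coordinates.
    Assumption: within a segment, the face moves linearly from that segment's
    viseme toward the next segment's viseme.
    """
    durations = [d for _, d in segments]
    # Onset time of each segment: 0, d0, d0 + d1, ...
    onsets = [0.0] + list(accumulate(durations))[:-1]
    # Index of the segment containing time_s, clamped to the valid range.
    i = min(max(bisect_right(onsets, time_s) - 1, 0), len(segments) - 1)
    # Fraction of the current segment already elapsed, clamped to [0, 1].
    weight = min(max((time_s - onsets[i]) / durations[i], 0.0), 1.0)
    current = visemes[segments[i][0]]
    following = visemes[segments[min(i + 1, len(segments) - 1)][0]]
    return blend(current, following, weight)


# Hypothetical data: two "static viseme models" with two coordinates each.
VISEMES = {"closed": [0.0, 0.0], "open": [0.0, 1.0]}
# Hypothetical synthesizer output: viseme labels with durations in seconds.
SEGMENTS = [("closed", 0.25), ("open", 0.5)]
```

With this data, `face_at(0.125, SEGMENTS, VISEMES)` is halfway through the first segment and returns `[0.0, 0.5]`, the midpoint between the two models. A real system would blend full 3D meshes and could mix in an emotion model as an additional weighted term.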