DocumentCode :
394299
Title :
Inversion of F0 model for natural-sounding speech synthesis
Author :
Rossi, Pierluigi Salvo ; Palmieri, Francesco ; Cutugno, Francesco
Author_Institution :
Dipt. di Inf. e Sistemistica, Naples Univ., Italy
Volume :
1
fYear :
2003
fDate :
6-10 April 2003
Abstract :
Natural-sounding speech synthesizers require information from a model that describes prosody quantitatively. H. Fujisaki's model (see "Dynamic Characteristics of Voice Fundamental Frequency in Speech and Singing", The Production of Speech, Springer-Verlag, p. 39-47, 1983) has shown considerable accuracy for many languages (Fujisaki et al., IEEE Int. Conf. on Acoustics, Speech and Sig. Processing, vol. 2, p. 211-14, 1993; Fujisaki and Ohno, Fourth Int. Conf. on Sig. Processing, vol. 1, p. 714-17, 1998). We propose a method for estimating the parameters of Fujisaki's model, i.e., an inversion method, based on the relative extrema of the pitch contour and a gradient-algorithm refinement procedure. Preliminary results show excellent performance of the proposed method in matching the pitch contours. Preliminary results of synthesis making use of the obtained features are very encouraging.
Keywords :
feature extraction; gradient methods; natural languages; parameter estimation; speech synthesis; F0 model inversion; Italian continuous speech; fundamental frequency; gradient algorithm refinement procedure; inversion methods; model feature extraction; natural-sounding speech synthesis; parameter estimation; pitch contour; prosody; Feature extraction; Filtering; Filters; Fluctuations; Inverse problems; Mean square error methods; Solids; Speech synthesis; Testing; Timing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-7663-3
Type :
conf
DOI :
10.1109/ICASSP.2003.1198832
Filename :
1198832