Title :
Harmonic envelope prediction for realistic speech synthesis using kernel interpolation
Author :
Fournier, Pierre-Alexandre ; Brault, Jean-Jules
Author_Institution :
Dept. of Electr. Eng., Ecole Polytech. de Montreal, Que., Canada
fDate :
July 31 2005-Aug. 4 2005
Abstract :
Harmonic and noise diphone concatenation is a proven method for obtaining high-quality speech synthesis, but it cannot be used when the basis corpus does not contain all the diphones needed. We propose a method to complete an individual's corpus using examples from other corpora. Five vowels from different speakers are parametrised with a harmonic and noise model (HNM). We use multi-frame analysis (MFA) and smoothing kernels to estimate the harmonic power spectrum envelopes. Different kernels are compared for predicting the harmonic envelopes of vowels from training data. We use Euclidean distance to measure similarity between the real envelopes and the predicted ones. Synthesis of the interpolated vowels is then performed using the learned optimal parameters. Our results show that Gaussian kernels can achieve a 1.8 dB (34.4%) reduction of harmonic distortion compared to the mean harmonic envelope estimator. As far as we know, there is no other literature on phoneme prediction for realistic speech synthesis.
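The abstract compares a Gaussian-kernel predictor against a mean-envelope baseline and scores both by Euclidean distance. The sketch below illustrates that comparison in general terms only; it is not the authors' implementation. It assumes a Nadaraya-Watson style weighted average, and the function names, the per-example feature vectors (`train_feats`, `query_feat`) and the `bandwidth` parameter are all hypothetical, since the record does not say what the kernel is computed over.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_envelope(train_envs):
    """Baseline: per-harmonic mean of all training envelopes."""
    n = len(train_envs)
    return [sum(col) / n for col in zip(*train_envs)]


def gaussian_kernel_envelope(train_feats, train_envs, query_feat, bandwidth):
    """Gaussian-weighted average of training envelopes (assumed form).

    train_feats: hypothetical feature vector for each training example.
    train_envs:  one envelope per training example, same length each.
    """
    sq = [euclidean_distance(f, query_feat) ** 2 for f in train_feats]
    # Shift by the smallest squared distance so the largest weight is 1,
    # which avoids underflow to an all-zero weight vector.
    smallest = min(sq)
    weights = [math.exp(-(d - smallest) / (2.0 * bandwidth ** 2)) for d in sq]
    total = sum(weights)
    return [
        sum(w * e for w, e in zip(weights, col)) / total
        for col in zip(*train_envs)
    ]


# Toy data (made up for illustration, not taken from the paper).
train_feats = [[0.0], [1.0], [10.0]]
train_envs = [[0.0, 0.0], [2.0, 2.0], [20.0, 20.0]]
predicted = gaussian_kernel_envelope(train_feats, train_envs, [0.5], 1.0)
baseline = mean_envelope(train_envs)
```

With a small bandwidth the prediction is dominated by the nearest training examples; as the bandwidth grows, the weights become uniform and the prediction approaches the mean-envelope baseline.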
Keywords :
geometry; harmonic analysis; interpolation; speech synthesis; 1.8 dB; Gaussian kernels; euclidian distance; harmonic power spectrum envelopes; interpolated vowels; kernel interpolation; multi-frame analysis; noise diphone concatenation; realistic speech synthesis; Acoustic noise; Harmonic analysis; Interpolation; Kernel; Linear predictive coding; Power system harmonics; Smoothing methods; Speech coding; Speech synthesis; Training data;
Conference_Titel :
Neural Networks, 2005. IJCNN '05. Proceedings. 2005 IEEE International Joint Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
0-7803-9048-2
DOI :
10.1109/IJCNN.2005.1556217