Title :
Support vector regression fusion scheme in phone duration modeling
Author :
Lazaridis, Alexandros ; Mporas, Iosif ; Ganchev, Todor ; Fakotakis, Nikos
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Patras, Rion, Greece
Abstract :
A fusion scheme of phone duration models (PDMs) is presented in this work. Specifically, a support vector regression (SVR)-fusion model is fed with the predictions of a group of independent PDMs operating in parallel. The American-English KED TIMIT and the Greek WCL-1 databases are used for evaluating the PDMs and the fusion scheme. The fusion scheme improves accuracy over the best individual model, achieving a relative reduction of the mean absolute error (MAE) and the root mean square error (RMSE) of 1.9% and 2.0%, respectively, on KED TIMIT, and of 2.6% and 1.8%, respectively, on WCL-1. Moreover, to evaluate the impact of this accuracy improvement on synthetic speech, a perceptual evaluation test was performed. This test showed that the accuracy improvement achieved by the SVR-fusion would contribute to improving the naturalness of synthetic speech.
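The relative error reductions quoted in the abstract can be illustrated with a minimal sketch. It assumes the usual definitions of MAE and RMSE and takes the relative reduction to be (baseline - fused) / baseline, which the abstract does not spell out. The function names and the duration values (in milliseconds) are hypothetical and are not taken from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted durations."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted durations."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def relative_reduction(baseline_error, fused_error):
    """Relative error reduction in percent (assumed definition)."""
    return 100.0 * (baseline_error - fused_error) / baseline_error


# Made-up phone durations in ms, for illustration only.
reference = [80.0, 120.0, 60.0, 100.0]
best_single_model = [90.0, 110.0, 70.0, 90.0]  # hypothetical best individual PDM
fused_model = [88.0, 112.0, 68.0, 92.0]        # hypothetical fusion output

mae_gain = relative_reduction(mae(reference, best_single_model),
                              mae(reference, fused_model))
rmse_gain = relative_reduction(rmse(reference, best_single_model),
                               rmse(reference, fused_model))
print(f"Relative MAE reduction:  {mae_gain:.1f}%")
print(f"Relative RMSE reduction: {rmse_gain:.1f}%")
```

With real data, `reference` would hold the measured phone durations of a test set, and the two prediction lists would come from the best individual PDM and from the fusion model evaluated on the same phones.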
Keywords :
mean square error methods; speech synthesis; support vector machines; American-English KED TIMIT; Greek WCL-1 databases; MAE; PDM; RMSE; SVR-fusion model; mean absolute error; phone duration modeling; root mean square error; support vector regression fusion; synthetic speech; Accuracy; Databases; Hidden Markov models; Prediction algorithms; Predictive models; Speech; Training; Phone duration modeling; speech synthesis; statistical modeling; text-to-speech;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2011.5947412