Title :
An investigation of implementation and performance analysis of DNN based speech synthesis system
Author :
Zhehuai Chen ; Kai Yu
Author_Institution :
Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai, China
Abstract :
Deep Neural Networks (DNNs), which can model hierarchical and complex relationships between the input and output layers, have recently been applied to speech synthesis. However, it remains unclear why DNNs outperform traditional HMM-based synthesis. This paper describes several implementation details of a DNN-based speech synthesis system and compares different influencing factors, e.g., the F0 modelling method and the addition of the BAP feature. The DNN-based system is then investigated further; in particular, Continuous F0 HMM (CF-HMM) is taken as the baseline for comparison, as its input and output features are more similar to those of the DNN-based system. Results show that the F0 modelling ability of the two systems is similar, although the CF-HMM system performs better. It seems that CF-HMM has been carefully strengthened by many techniques, whereas using a DNN to model F0 is still rough and needs more research. Another experiment shows that CF-HMM also does better in mcep modelling, which needs further investigation.
Keywords :
hidden Markov models; neural nets; speech synthesis; CF-HMM; Continuous F0 HMM; DNN based speech synthesis system; HMM based synthesis; Hidden Markov Model; deep neural network; performance analysis; speech synthesis system; Acoustics; Data models; Speech; Training; Vectors; DNN; MSD-HMM
Conference_Title :
Signal Processing (ICSP), 2014 12th International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4799-2188-1
DOI :
10.1109/ICOSP.2014.7015070