DocumentCode :
1693866
Title :
Multi-distribution deep belief network for speech synthesis
Author :
Shiyin Kang ; Xiaojun Qian ; Hsiang-Yun Meng
Author_Institution :
Dept. of Syst. Eng. & Eng. Manage., Chinese Univ. of Hong Kong, Hong Kong, China
fYear :
2013
Firstpage :
8012
Lastpage :
8016
Abstract :
The deep belief network (DBN) has been shown to be a good generative model in tasks such as handwritten digit image generation. Previous work on DBNs in the speech community has focused mainly on using a generatively pre-trained DBN to initialize a discriminative model for better acoustic modeling in speech recognition. To fully exploit its generative nature, we propose to model the speech parameters, including the spectrum and F0, simultaneously, and to generate these parameters from the DBN for speech synthesis. Compared with the predominant HMM-based approach, objective evaluation shows that the spectrum generated from the DBN has less distortion. Subjective results also confirm the advantage of the DBN-generated spectrum, and the overall quality is comparable to that of a context-independent HMM.
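The multi-distribution idea in the abstract — modeling continuous spectral features and a binary voicing decision with different visible-unit types in one network — can be illustrated with a single restricted Boltzmann machine layer. The sketch below is a minimal toy implementation with assumed dimensions and CD-1 training, not the paper's actual model or data: it combines Gaussian visible units (for spectrum-like features) and a Bernoulli visible unit (for a voiced/unvoiced flag) under one weight matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions for illustration, not from the paper):
N_GAUSS = 8   # Gaussian visible units, e.g. spectral features
N_BERN = 1    # Bernoulli visible unit, e.g. voiced/unvoiced flag
N_HID = 16    # Bernoulli hidden units

W = rng.normal(0, 0.01, size=(N_GAUSS + N_BERN, N_HID))
b_vis = np.zeros(N_GAUSS + N_BERN)
b_hid = np.zeros(N_HID)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probs(v):
    # Hidden units are Bernoulli for both visible distributions.
    return sigmoid(v @ W + b_hid)

def visible_means(h):
    act = h @ W.T + b_vis
    v = act.copy()
    # Gaussian units: mean is the linear activation (unit variance assumed);
    # Bernoulli units: mean is the sigmoid of the activation.
    v[:, N_GAUSS:] = sigmoid(act[:, N_GAUSS:])
    return v

def cd1_step(v0, lr=0.01):
    """One step of contrastive divergence (CD-1) on a data batch."""
    global W, b_vis, b_hid
    h0 = hidden_probs(v0)
    h0_sample = (rng.random(h0.shape) < h0).astype(float)
    v1 = visible_means(h0_sample)          # reconstruction
    h1 = hidden_probs(v1)
    W += lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
    b_vis += lr * (v0 - v1).mean(axis=0)
    b_hid += lr * (h0 - h1).mean(axis=0)
    return v1

# Toy data: standardized "spectra" plus a binary voicing flag.
spec = rng.normal(size=(64, N_GAUSS))
voiced = rng.integers(0, 2, size=(64, N_BERN)).astype(float)
data = np.hstack([spec, voiced])

recon = cd1_step(data)
```

In a full DBN, several such layers would be stacked and trained greedily, and generation would run top-down from the deepest layer; this sketch only shows how heterogeneous visible distributions share one hidden representation.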
Keywords :
belief networks; handwriting recognition; hidden Markov models; speech recognition; speech synthesis; DBN; HMM-based approach; SR; acoustic modeling; handwritten digit image generation; multidistribution deep belief network; speech community; speech parameters; speech recognition; speech synthesis; Acoustics; Hidden Markov models; Speech; Speech recognition; Speech synthesis; Training; Deep belief network; Speech synthesis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6639225
Filename :
6639225