DocumentCode :
1693866
Title :
Multi-distribution deep belief network for speech synthesis
Author :
Shiyin Kang ; Xiaojun Qian ; Hsiang-Yun Meng
Author_Institution :
Dept. of Syst. Eng. & Eng. Manage., Chinese Univ. of Hong Kong, Hong Kong, China
fYear :
2013
Firstpage :
8012
Lastpage :
8016
Abstract :
The deep belief network (DBN) has been shown to be a good generative model in tasks such as handwritten digit image generation. Previous work on DBNs in the speech community has focused mainly on using a generatively pre-trained DBN to initialize a discriminative model for better acoustic modeling in speech recognition. To fully exploit its generative nature, we propose to model the speech parameters, including the spectrum and F0, simultaneously, and to generate these parameters from the DBN for speech synthesis. Compared with the predominant HMM-based approach, objective evaluation shows that the spectrum generated from the DBN has less distortion. Subjective results also confirm the advantage of the DBN-generated spectrum, and the overall quality is comparable to that of a context-independent HMM.
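The multi-distribution idea in the abstract — modeling continuous spectral features and a binary voicing decision with different visible-unit types in one network — can be illustrated with a single restricted Boltzmann machine layer. The sketch below is a minimal toy implementation with assumed dimensions and CD-1 training, not the paper's actual model or data: it combines Gaussian visible units (for spectrum-like features) and a Bernoulli visible unit (for a voiced/unvoiced flag) under one weight matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions for illustration, not from the paper):
N_GAUSS = 8   # Gaussian visible units, e.g. spectral features
N_BERN = 1    # Bernoulli visible unit, e.g. voiced/unvoiced flag
N_HID = 16    # Bernoulli hidden units

W = rng.normal(0, 0.01, size=(N_GAUSS + N_BERN, N_HID))
b_vis = np.zeros(N_GAUSS + N_BERN)
b_hid = np.zeros(N_HID)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probs(v):
    # Hidden units are Bernoulli for both visible distributions.
    return sigmoid(v @ W + b_hid)

def visible_means(h):
    act = h @ W.T + b_vis
    v = act.copy()
    # Gaussian units: mean is the linear activation (unit variance assumed);
    # Bernoulli units: mean is the sigmoid of the activation.
    v[:, N_GAUSS:] = sigmoid(act[:, N_GAUSS:])
    return v

def cd1_step(v0, lr=0.01):
    """One step of contrastive divergence (CD-1) on a data batch."""
    global W, b_vis, b_hid
    h0 = hidden_probs(v0)
    h0_sample = (rng.random(h0.shape) < h0).astype(float)
    v1 = visible_means(h0_sample)          # reconstruction
    h1 = hidden_probs(v1)
    W += lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
    b_vis += lr * (v0 - v1).mean(axis=0)
    b_hid += lr * (h0 - h1).mean(axis=0)
    return v1

# Toy data: standardized "spectra" plus a binary voicing flag.
spec = rng.normal(size=(64, N_GAUSS))
voiced = rng.integers(0, 2, size=(64, N_BERN)).astype(float)
data = np.hstack([spec, voiced])

recon = cd1_step(data)
```

In a full DBN, several such layers would be stacked and trained greedily, and generation would run top-down from the deepest layer; this sketch only shows how heterogeneous visible distributions share one hidden representation.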
Keywords :
belief networks; handwriting recognition; hidden Markov models; speech recognition; speech synthesis; DBN; HMM-based approach; SR; acoustic modeling; handwritten digit image generation; multidistribution deep belief network; speech community; speech parameters; speech recognition; speech synthesis; Acoustics; Hidden Markov models; Speech; Speech recognition; Speech synthesis; Training; Deep belief network; Speech synthesis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6639225
Filename :
6639225