DocumentCode :
3421889
Title :
On the state definition for a trainable excitation model in HMM-based speech synthesis
Author :
Maia, R. ; Toda, T. ; Tokuda, K. ; Sakai, S. ; Nakamura, S.
Author_Institution :
Nat. Inst. of Inform, Koganei
fYear :
2008
fDate :
March 31 2008 - April 4 2008
Firstpage :
3965
Lastpage :
3968
Abstract :
One issue with speech synthesizers based on hidden Markov models is the vocoded quality of the synthesized speech. Drawing on the principle of analysis-by-synthesis speech coders, a trainable excitation model has been proposed to improve naturalness: a set of state-dependent filters is designed to minimize the distortion between the residual and the synthetic excitation. Although this approach seems successful, state definition remains an open issue. This paper describes a method for state definition in which bottom-up clustering is performed on full-context decision trees, using the likelihood of the residual database as the merging criterion. Experiments show that residual modeling can be improved through better filter design.
Keywords :
hidden Markov models; speech coding; speech synthesis; analysis-by-synthesis speech coders; state-dependent filters; trainable excitation model; Databases; Decision trees; Filters; Merging; Natural languages; Signal synthesis; Speech analysis; Synthesizers; Speech processing; digital filters
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
ISSN :
1520-6149
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2008.4518522
Filename :
4518522