Title :
Speaking style adaptation using context clustering decision tree for HMM-based speech synthesis
Author :
Yamagishi, Junichi ; Tachibana, Makoto ; Masuko, Takashi ; Kobayashi, Takao
Author_Institution :
Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Japan
Abstract :
This paper describes an MLLR-based speaking style adaptation technique for HMM-based speech synthesis. Since speaking styles and emotional expressions are characterized by many suprasegmental features as well as segmental features, it is necessary to adapt suprasegmental features for speaking style adaptation. To achieve suprasegmental feature adaptation, we utilize context clustering decision trees, which are constructed in the training stage, for tying of regression matrices. Using this technique, we adapt an initial "reading" style model to "joyful" or "sad" styles. Experimental results show that, using 50 adaptation sentences, speech samples generated from adapted models were judged to be similar to the target speaking styles at rates of 92% and 70% for joyful and sad styles, respectively.
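As a rough illustration of the idea in the abstract, the sketch below shows how an MLLR-style affine transform (adapted mean = A * mean + b) can be shared by all distributions that fall under the same decision-tree node. It is only a minimal sketch under stated assumptions: the function names, the node labels, and every numeric value are hypothetical and not taken from the paper, and the estimation of the transforms from adaptation data is omitted entirely.

```python
# Hypothetical illustration only: names and numbers are not from the paper.
# Each Gaussian mean is adapted with the affine transform (A, b) that is
# tied to the decision-tree node the distribution belongs to.

def apply_affine(A, b, mu):
    """Return A * mu + b for a square matrix A and vectors b, mu (plain lists)."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]

def adapt_means(means, node_of, transforms):
    """Adapt every mean with the transform tied to its decision-tree node."""
    adapted = {}
    for name, mu in means.items():
        A, b = transforms[node_of[name]]
        adapted[name] = apply_affine(A, b, mu)
    return adapted

# Toy "reading"-style means for three distributions (assumed values).
means = {"s1": [1.0, 2.0], "s2": [0.0, 1.0], "s3": [3.0, -1.0]}

# Assumed mapping from each distribution to a decision-tree node.
node_of = {"s1": "node_a", "s2": "node_a", "s3": "node_b"}

# One (A, b) pair per node; in practice these would be estimated from
# the adaptation sentences, which this sketch does not do.
transforms = {
    "node_a": ([[1.0, 0.0], [0.0, 2.0]], [0.5, 0.0]),
    "node_b": ([[0.0, 1.0], [1.0, 0.0]], [0.0, 0.0]),
}

adapted = adapt_means(means, node_of, transforms)
```

Because s1 and s2 map to the same node, they share one transform, so a small amount of adaptation data observed for either one would influence both.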
Keywords :
decision trees; feature extraction; hidden Markov models; matrix algebra; pattern clustering; regression analysis; speech synthesis; HMM-based speech synthesis; MLLR-based technique; context clustering decision tree; regression matrices; speaking style adaptation; suprasegmental feature adaptation; Adaptation model; Context modeling; Databases; Decision trees; Hidden Markov models; Maximum likelihood linear regression; Probability distribution; Regression tree analysis; Speech synthesis; Stress
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1325908