DocumentCode :
118054
Title :
HMM-based Thai speech synthesis using unsupervised stress context labeling
Author :
Moungsri, Decha ; Koriyama, Tomoki ; Kobayashi, Takao
Author_Institution :
Interdiscipl. Grad. Sch. of Sci. & Eng., Tokyo Inst. of Technol., Tokyo, Japan
fYear :
2014
fDate :
9-12 Dec. 2014
Firstpage :
1
Lastpage :
4
Abstract :
This paper describes an approach to HMM-based Thai speech synthesis using stress context. It has been shown that context related to stressed/unstressed syllable information (stress context) significantly improves the tone correctness of synthetic speech, but it requires a manual context labeling process for tone modeling. To reduce the cost of stress context labeling, we propose an unsupervised technique for automatic labeling based on the characteristics of Thai stressed syllables, namely, high F0 movement and long duration. In the proposed technique, we use the log F0 variance and duration of each syllable to classify it into one of the stress-related context classes. Objective and subjective evaluation results show that the proposed context labeling gives performance comparable to careful manual labeling by a human in terms of the tone naturalness of synthetic speech.
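The abstract names the two features used (per-syllable log F0 variance and duration) but not the classification rule or the number of classes. The following is only a minimal sketch under stated assumptions, not the authors' method: it assumes two classes, z-normalizes both features over the utterance, and labels a syllable as stressed when the summed score is above average. The function names, the scoring rule, and the sample data are all hypothetical.

```python
import math
from statistics import mean, pstdev, pvariance


def log_f0_variance(f0_hz):
    """Variance of log F0 over the voiced frames (F0 > 0) of one syllable."""
    voiced = [math.log(f) for f in f0_hz if f > 0]
    return pvariance(voiced) if len(voiced) > 1 else 0.0


def zscores(values):
    """Standardize values to zero mean and unit variance (all zeros if constant)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def label_stress(syllables):
    """Label each (f0_frames_hz, duration_s) pair as 'stressed' or 'unstressed'.

    Assumption: a syllable counts as stressed when the sum of its
    standardized log F0 variance and standardized duration is positive.
    """
    variances = zscores([log_f0_variance(f0) for f0, _ in syllables])
    durations = zscores([dur for _, dur in syllables])
    scores = [v + d for v, d in zip(variances, durations)]
    return ["stressed" if s > 0 else "unstressed" for s in scores]


# Hypothetical data: flat, short syllables versus moving, long ones.
syllables = [
    ([120.0, 121.0, 120.0, 119.0], 0.12),
    ([110.0, 140.0, 170.0, 200.0], 0.35),
    ([130.0, 131.0, 130.0], 0.10),
    ([150.0, 120.0, 100.0, 90.0], 0.30),
]
labels = label_stress(syllables)
print(labels)
```

In this toy input, the syllables with large F0 movement and long duration receive the stressed label; a real system would need F0 extraction, syllable alignment, and a classification rule validated against manually labeled data.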
Keywords :
hidden Markov models; natural language processing; speech synthesis; unsupervised learning; HMM-based Thai speech synthesis; cost reduction; hidden Markov model; high F0 movement; log F0 variance; manual context labeling process; stressed syllable information; synthetic speech tone correctness improvement; tone modeling; tone naturalness; unstressed syllable information; unsupervised stress context labeling; Context; Hidden Markov models; Labeling; Manuals; Speech; Speech synthesis; Stress;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)
Conference_Location :
Siem Reap
Type :
conf
DOI :
10.1109/APSIPA.2014.7041599
Filename :
7041599