DocumentCode
475390
Title
Automatic duration weighting in Thai unit-selection speech synthesis
Author
Saychum, S. ; Rugchatjaroen, A. ; Thatphithakkul, N. ; Wutiwiwatchai, C. ; Thangthai, A.
Author_Institution
Human Language Technol. Lab., Nat. Electron. & Comput. Technol. Center (NECTEC), Bangkok
Volume
1
fYear
2008
fDate
14-17 May 2008
Firstpage
549
Lastpage
552
Abstract
This paper presents an improvement in the naturalness of Thai unit-selection text-to-speech (TTS) synthesis through automatic weighting of the targeted cost. The intuition behind the proposed method is that human perception may be more sensitive to some phonemic and prosodic units than to others. In this work, the unit-selection targeted cost of each phoneme unit is weighted differently according to its duration statistic and voicing characteristic. Two automatic weighting algorithms, based on the statistical mean and standard deviation of phoneme duration, are comparatively evaluated. A subjective test shows a mean opinion score (MOS) improvement of 0.46 over baseline speech synthesized without targeted-cost weighting.
Keywords
natural language processing; speech processing; speech synthesis; Thai unit-selection speech synthesis; automatic weighting algorithms; human perception; naturalness improvement; phoneme unit; text-to-speech synthesis; voicing characteristic; Cost function; Humans; Laboratories; Natural languages; Paper technology; Predictive models; Speech synthesis; Statistics; Tagging; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, 2008. ECTI-CON 2008. 5th International Conference on
Conference_Location
Krabi
Print_ISBN
978-1-4244-2101-5
Electronic_ISBN
978-1-4244-2102-2
Type
conf
DOI
10.1109/ECTICON.2008.4600492
Filename
4600492