DocumentCode
394335
Title
Segment selection considering local degradation of naturalness in concatenative speech synthesis
Author
Toda, Tomoki ; Kawai, Hisashi ; Tsuzaki, Masanori ; Shikano, Kiyohiro
Author_Institution
ATR Spoken Language Translation Res. Labs., Kyoto, Japan
Volume
1
fYear
2003
fDate
6-10 April 2003
Abstract
In this paper, we investigate the effect of using a novel cost, the root mean square (RMS) cost, for segment selection in concatenative text-to-speech synthesis. The RMS cost is affected not only by the total degradation of naturalness but also by the local degradation of naturalness. Experiments comparing this approach with segment selection based on a conventional average cost show that: (1) segment selection based on the RMS cost performs a larger number of concatenations that cause slight local degradation in order to avoid concatenations that cause greater local degradation; and (2) the effect of the RMS cost depends little on the size of the corpus. We also show that the naturalness of synthetic speech can be slightly improved by using the RMS cost.
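The contrast between the two costs can be illustrated with a minimal sketch. It assumes each candidate segment sequence is summarized by a list of non-negative local costs, one per segment or concatenation point; the abstract does not give the paper's actual sub-costs or weights, so the function names and numbers below are hypothetical and chosen only to show the effect described in finding (1).

```python
import math


def average_cost(local_costs):
    """Conventional average cost: mean of the local costs (assumed form)."""
    return sum(local_costs) / len(local_costs)


def rms_cost(local_costs):
    """RMS cost: root mean square of the local costs (assumed form).

    Squaring weights large local costs more heavily, so a single severe
    degradation is penalized more than several slight ones.
    """
    return math.sqrt(sum(c * c for c in local_costs) / len(local_costs))


# Hypothetical local costs for two candidate sequences (illustrative values).
candidates = {
    "many_small_jumps": [0.25, 0.25, 0.25, 0.25],
    "one_large_jump": [0.0, 0.0, 0.0, 0.8],
}

# Select the candidate that minimizes each cost.
best_by_average = min(candidates, key=lambda k: average_cost(candidates[k]))
best_by_rms = min(candidates, key=lambda k: rms_cost(candidates[k]))

print("average cost selects:", best_by_average)
print("RMS cost selects:", best_by_rms)
```

Under these assumed values the average cost favors the sequence with one severe local degradation, while the RMS cost favors the sequence with several slight ones, which is the qualitative behavior reported in the abstract.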
Keywords
speech processing; speech synthesis; RMS cost; concatenative speech synthesis; local naturalness degradation; root mean square cost; segment selection; text-to-speech synthesis; Cities and towns; Cost function; Degradation; Information science; Laboratories; Natural languages; Root mean square; Speech synthesis; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-7663-3
Type
conf
DOI
10.1109/ICASSP.2003.1198876
Filename
1198876