• DocumentCode
    394335
  • Title
    Segment selection considering local degradation of naturalness in concatenative speech synthesis
  • Author
    Toda, Tomoki ; Kawai, Hisashi ; Tsuzaki, Masanori ; Shikano, Kiyohiro
  • Author_Institution
    ATR Spoken Language Translation Res. Labs., Kyoto, Japan
  • Volume
    1
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    In this paper, we investigate the effect of using a novel cost, RMS (root mean square) cost, for segment selection for concatenative text-to-speech synthesis. The RMS cost is affected not only by the total degradation of naturalness but also by the local degradation of naturalness. From the results of experiments comparing this approach with segment selection based on a conventional average cost, it is found that: (1) in the segment selection based on the RMS cost a larger number of concatenations causing slight local degradation are performed in order to avoid concatenations causing greater local degradation; and (2) the effect of the RMS cost has little dependence on the size of the corpus. Moreover, we clarify that the naturalness of synthetic speech can be slightly improved by utilizing the RMS cost.
  • Keywords
    speech processing; speech synthesis; RMS cost; concatenative speech synthesis; local naturalness degradation; root mean square cost; segment selection; text-to-speech synthesis; Cities and towns; Cost function; Degradation; Information science; Laboratories; Natural languages; Root mean square; Speech synthesis; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2003.1198876
  • Filename
    1198876
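
The contrast the abstract draws between an average cost and an RMS cost can be illustrated with a small sketch. The record does not give the paper's formulas, so everything here is an assumption: each candidate segment sequence is represented by a list of hypothetical non-negative per-concatenation local costs, the average cost is taken as their mean, and the RMS cost as the square root of the mean of their squares. The function names and numbers are illustrative only.

```python
import math


def average_cost(local_costs):
    """Mean of the local costs (assumed form of the conventional average cost)."""
    return sum(local_costs) / len(local_costs)


def rms_cost(local_costs):
    """Root mean square of the local costs (assumed form of the RMS cost)."""
    return math.sqrt(sum(c * c for c in local_costs) / len(local_costs))


# Hypothetical local costs for two candidate segment sequences.
even = [0.3, 0.3, 0.3, 0.3]   # several slight local degradations
spiky = [0.0, 0.0, 0.0, 1.0]  # one large local degradation, smaller total

for name, costs in (("even", even), ("spiky", spiky)):
    print(name, round(average_cost(costs), 3), round(rms_cost(costs), 3))
```

Under these assumptions the average cost ranks the sequence with a single large degradation as better, while the RMS cost ranks the sequence with several slight degradations as better, which matches finding (1) in the abstract.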