• DocumentCode
    417231
  • Title

    Optimizing sub-cost functions for segment selection based on perceptual evaluations in concatenative speech synthesis

  • Author

    Toda, Tomoki ; Kawai, Hisashi ; Tsuzaki, Minoru

  • Author_Institution
    Graduate Sch. of Eng., Nagoya Inst. of Technol., Aichi, Japan
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    In concatenative speech synthesis, various factors affect the naturalness of synthetic speech. A cost for segment selection is calculated by integrating several sub-costs that capture the degradation of naturalness caused by such factors. In this paper, we optimize each sub-cost function, which converts a linguistic feature or an acoustic parameter into a sub-cost, based on perceptual evaluations. Two types of perceptual experiments are performed with test sets constructed by controlling the variations of sub-costs, in order to evaluate the independent effect of each sub-cost and the interactions between them. We demonstrate the effectiveness of perceptually optimizing the sub-cost functions through a preference test comparing synthetic speech before and after the optimization.
  • Keywords
    Pareto optimisation; feature extraction; parameter estimation; speech synthesis; acoustic parameter; concatenative speech synthesis; linguistic feature; optimization; perceptual evaluations; segment selection; speech naturalness; sub-cost functions; Acoustic measurements; Acoustic testing; Cost function; Degradation; Laboratories; Natural languages; Performance evaluation; Speech analysis; Speech synthesis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.2004.1326071
  • Filename
    1326071