• DocumentCode
    1932278
  • Title
    A Unified Totally-Data-Driven Prediction of Duration and Pause in TTS
  • Author
    Hao, Jie ; Yi, Lifu ; Li, Jian ; Lou, Xiaoyan
  • Author_Institution
    Toshiba Res. & Dev. Center
  • Volume
    1
  • fYear
    2006
  • fDate
    16-20 2006
  • Abstract
    This paper proposes a unified framework for duration and pause prediction in TTS. The framework is based on a plain generalized linear model (GLM) under the assumption of a Gaussian distribution for duration, and on a logistic GLM under the assumption of a Bernoulli distribution for pause. Significant attributes and attribute interactions are selected automatically to guarantee a good balance between model reliability and goodness of fit. Both the linear attributes and the nonlinear attribute interactions are selected in a totally data-driven manner rather than intuitively. Speaking rate is introduced as a new attribute, which increases prediction precision and also provides a novel way to adjust speaking rate during synthesis. The GLM is trained by stepwise regression under the F-test and the Bayes information criterion (BIC). Open-test experiments show that the proposed GLM approach outperforms the decision-tree-based approach. In addition, the GLM approach is compact and suitable for embedded TTS.
  • Keywords
    Bayes methods; Gaussian distribution; decision trees; speech synthesis; Bayes information criterion; Bernoulli distribution; TTS; decision tree based approach; generalized linear model; nonlinear attribute interactions; text to speech system; unified totally-data-driven prediction; Hidden Markov models; Logistics; Predictive models; Research and development; Statistical distributions; Support vector machines; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing, 2006 8th International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    0-7803-9736-3
  • Electronic_ISBN
    0-7803-9736-3
  • Type
    conf
  • DOI
    10.1109/ICOSP.2006.345530
  • Filename
    4128945