• DocumentCode
    998596
  • Title

    Towards Signal-Based Instrumental Quality Diagnosis for Text-to-Speech Systems

  • Author

    Falk, Tiago H. ; Möller, Sebastian

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Queen's Univ., Kingston, ON
  • Volume
    15
  • fYear
    2008
  • fDate
    2008
  • Firstpage
    781
  • Lastpage
    784
  • Abstract
    In this letter, the first steps toward the development of a signal-based instrumental quality measure for text-to-speech (TTS) systems are described. Hidden Markov models (HMM), trained on naturally-produced speech, serve as artificial text- and speaker-independent reference models against which synthesized speech signals are assessed. A normalized log-likelihood measure, computed between perceptual features extracted from synthesized speech and a gender-dependent HMM reference model, is proposed and shown to be a reliable parameter for multidimensional TTS quality diagnosis. Experiments with subjectively scored synthesized speech data show that the proposed measure attains promising estimation performance for quality dimensions labeled overall impression, listening effort, naturalness, continuity/fluency, and acceptance.
  • Keywords
    feature extraction; hidden Markov models; speech synthesis; hidden Markov models; log-likelihood measure; signal-based instrumental quality diagnosis; speaker-independent reference; speech signals; text-independent reference; text-to-speech systems; Data mining; Feature extraction; Hidden Markov models; Instruments; Multidimensional systems; Natural languages; Signal processing algorithms; Signal synthesis; Speech synthesis; Testing; Hidden Markov model; multidimensional quality diagnosis; quality prediction; synthesized speech; text-to-speech (TTS);
  • fLanguage
    English
  • Journal_Title
    Signal Processing Letters, IEEE
  • Publisher
    IEEE
  • ISSN
    1070-9908
  • Type

    jour

  • DOI
    10.1109/LSP.2008.2006709
  • Filename
    4682563