Title :
Towards Signal-Based Instrumental Quality Diagnosis for Text-to-Speech Systems
Author :
Falk, Tiago H. ; Möller, Sebastian
Author_Institution :
Dept. of Electr. & Comput. Eng., Queen's Univ., Kingston, ON
fDate :
2008
Abstract :
In this letter, the first steps toward the development of a signal-based instrumental quality measure for text-to-speech (TTS) systems are described. Hidden Markov models (HMM), trained on naturally-produced speech, serve as artificial text- and speaker-independent reference models against which synthesized speech signals are assessed. A normalized log-likelihood measure, computed between perceptual features extracted from synthesized speech and a gender-dependent HMM reference model, is proposed and shown to be a reliable parameter for multidimensional TTS quality diagnosis. Experiments with subjectively scored synthesized speech data show that the proposed measure attains promising estimation performance for quality dimensions labeled overall impression, listening effort, naturalness, continuity/fluency, and acceptance.
Keywords :
feature extraction; hidden Markov models; speech synthesis; log-likelihood measure; signal-based instrumental quality diagnosis; speaker-independent reference; speech signals; text-independent reference; text-to-speech systems; Data mining; Instruments; Multidimensional systems; Natural languages; Signal processing algorithms; Signal synthesis; Testing; Hidden Markov model; multidimensional quality diagnosis; quality prediction; synthesized speech; text-to-speech (TTS)
Journal_Title :
Signal Processing Letters, IEEE
DOI :
10.1109/LSP.2008.2006709