DocumentCode
998596
Title
Towards Signal-Based Instrumental Quality Diagnosis for Text-to-Speech Systems
Author
Falk, Tiago H. ; Möller, Sebastian
Author_Institution
Dept. of Electr. & Comput. Eng., Queen's Univ., Kingston, ON
Volume
15
fYear
2008
fDate
2008
Firstpage
781
Lastpage
784
Abstract
In this letter, the first steps toward the development of a signal-based instrumental quality measure for text-to-speech (TTS) systems are described. Hidden Markov models (HMM), trained on naturally-produced speech, serve as artificial text- and speaker-independent reference models against which synthesized speech signals are assessed. A normalized log-likelihood measure, computed between perceptual features extracted from synthesized speech and a gender-dependent HMM reference model, is proposed and shown to be a reliable parameter for multidimensional TTS quality diagnosis. Experiments with subjectively scored synthesized speech data show that the proposed measure attains promising estimation performance for quality dimensions labeled overall impression, listening effort, naturalness, continuity/fluency, and acceptance.
Keywords
feature extraction; hidden Markov models; speech synthesis; hidden Markov models; log-likelihood measure; signal-based instrumental quality diagnosis; speaker-independent reference; speech signals; text-independent reference; text-to-speech systems; Data mining; Feature extraction; Hidden Markov models; Instruments; Multidimensional systems; Natural languages; Signal processing algorithms; Signal synthesis; Speech synthesis; Testing; Hidden Markov model; multidimensional quality diagnosis; quality prediction; synthesized speech; text-to-speech (TTS);
fLanguage
English
Journal_Title
Signal Processing Letters, IEEE
Publisher
ieee
ISSN
1070-9908
Type
jour
DOI
10.1109/LSP.2008.2006709
Filename
4682563