DocumentCode :
1460245
Title :
Instrumental Assessment of Prosodic Quality for Text-to-Speech Signals
Author :
Norrenbrock, Christoph R. ; Hinterleitner, Florian ; Heute, Ulrich ; Möller, Sebastian
Author_Institution :
Digital Signal Processing and System Theory Group (DSS), University of Kiel, Kiel, Germany
Volume :
19
Issue :
5
Year :
2012
Date :
5/1/2012
Firstpage :
255
Lastpage :
258
Abstract :
Formal parameters of speech prosody are investigated with respect to their ability to estimate the perceptual quality of text-to-speech (TTS) signals. The study is carried out for the German language using a broad database comprising a wide range of TTS systems and text materials. Eighteen purely acoustic markers, derived from F0 and vocalic/consonantal durations, are analysed individually and in conjunction via cross-validated regression models. The F0 slope within voiced segments proves particularly useful when integrated in a nonlinear fashion, whereas measures of durational variation perform comparatively weakly. The results highlight a strong potential for instrumental estimation techniques of TTS quality.
Keywords :
acoustic signal processing; estimation theory; natural language processing; regression analysis; speech synthesis; German language; TTS signals; TTS systems; acoustic markers; broad databases; consonantal durations; cross-validated regression models; durational variation; formal parameters; instrumental assessment; instrumental estimation techniques; nonlinear fashion; perceptual quality estimation; prosodic quality; speech prosody; text materials; text-to-speech signals; vocalic durations; voiced segments; Correlation; Databases; Instruments; Materials; Rhythm; Speech; Timing; Instrumental quality assessment; prosody; speech quality; text-to-speech (TTS);
Language :
English
Journal_Title :
Signal Processing Letters, IEEE
Publisher :
IEEE
ISSN :
1070-9908
Type :
jour
DOI :
10.1109/LSP.2012.2189562
Filename :
6161607