DocumentCode
1460245
Title
Instrumental Assessment of Prosodic Quality for Text-to-Speech Signals
Author
Norrenbrock, Christoph R. ; Hinterleitner, Florian ; Heute, Ulrich ; Möller, Sebastian
Author_Institution
Digital Signal Process. & Syst. Theor. Group (DSS), Univ. of Kiel, Kiel, Germany
Volume
19
Issue
5
fYear
2012
fDate
5/1/2012 12:00:00 AM
Firstpage
255
Lastpage
258
Abstract
Formal parameters of speech prosody are investigated with respect to their ability to estimate the perceptual quality of text-to-speech (TTS) signals. The study is carried out for the German language using a broad database comprising a wide range of TTS systems and text materials. Eighteen purely acoustic markers, derived from F0 and vocalic/consonantal durations, are analysed individually and in conjunction via cross-validated regression models. The F0 slope within voiced segments proves particularly useful when integrated in a nonlinear fashion, whereas measures of durational variation perform comparatively weakly. The results highlight a strong potential for instrumental estimation techniques of TTS quality.
Keywords
acoustic signal processing; estimation theory; natural language processing; regression analysis; speech synthesis; German language; TTS signals; TTS systems; acoustic markers; broad databases; consonantal durations; cross-validated regression models; durational variation; formal parameters; instrumental assessment; instrumental estimation techniques; nonlinear fashion; perceptual quality estimation; prosodic quality; speech prosody; text materials; text-to-speech signals; vocalic durations; voiced segments; Correlation; Databases; Instruments; Materials; Rhythm; Speech; Timing; Instrumental quality assessment; prosody; speech quality; text-to-speech (TTS)
fLanguage
English
Journal_Title
Signal Processing Letters, IEEE
Publisher
IEEE
ISSN
1070-9908
Type
jour
DOI
10.1109/LSP.2012.2189562
Filename
6161607
Link To Document