Title :
Perceptual evaluation of dynamic cost weighting for unit selection TTS
Author :
Bellegarda, Jerome R.
Author_Institution :
Speech & Language Technol., Apple Inc., Cupertino, CA, USA
Abstract :
Unit selection text-to-speech synthesis relies on multiple cost criteria, each encapsulating a different aspect of acoustic and prosodic context at any given concatenation point. For a particular set of criteria, the relative weighting of the resulting costs crucially affects final candidate ranking. We have recently advocated a new weighting strategy based on a data-driven framework separately optimized for each concatenation. In this approach, the cost distribution in every information stream is dynamically leveraged to locally shift weight towards those characteristics that prove most discriminative at this point. To further validate this procedure, this paper presents formal listening evidence suggesting that dynamic cost weighting indeed entails higher perceived TTS quality.
Keywords :
optimisation; speech synthesis; acoustic context; concatenation point; cost distribution; dynamic cost weighting; final candidate ranking; formal listening evidence; information stream; multiple cost criteria; optimization; perceptual evaluation; prosodic context; unit selection TTS; unit selection text-to-speech synthesis; Cost function; Diversity reception; Extrapolation; Humans; Information analysis; Natural languages; Optimization methods; Pathology; Speech analysis; Speech synthesis; candidate ranking; concatenative speech synthesis; cost weighting; unit selection;
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495152