Title :
Improving High Quality TTS using Circular Linear Prediction and Constant Pitch Transform
Author :
Shukla, Satyavati ; Barnwell, T.P.
Author_Institution :
Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
Abstract :
Current high-quality concatenative TTS systems are based on unit selection from a database that is contextually and prosodically rich. These systems are computationally expensive and require a very large footprint. This paper presents a new method for representing speech segments that can improve the quality and scalability of concatenative TTS systems. The circular linear prediction model, combined with the constant pitch transform, provides a robust representation of speech signals that allows for limited prosodic movements without perceivable loss in quality. A method is presented for constraining the LSF tracks of speech segments to realize pitch modifications with minimal artifacts. The results of formal listening tests demonstrate that limited prosodic modifications can produce speech from fewer units whose quality equals or exceeds that of large-database unit-selection systems. Additionally, this method is used to realize high-quality emphasized speech.
Keywords :
speech processing; speech synthesis; transforms; LSF tracks; TTS; circular linear prediction; constant pitch transform; large database unit-selection systems; prosodic modifications; robust representation; speech segments; speech signals; text-to-speech synthesis; Data engineering; Databases; Predictive models; Robustness; Scalability; Speech coding; System testing; Vocabulary; linear predictive coding; speech communication; speech intelligibility
Conference_Title :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0727-3
DOI :
10.1109/ICASSP.2007.367004