Title :
Statistical methods in data-driven modeling of Spanish prosody for text to speech
Author :
Lopez-Gonzalo, E. ; Rodríguez-Garcia, J.M.
Author_Institution :
ETSI Telecomunicacion, Univ. Politecnica de Madrid, Spain
Abstract :
In (Lopez-Gonzalo et al., 1995), we proposed an automatic data-driven methodology for modeling both fundamental frequency and segmental duration in TTS converters from a monospeaker recorded corpus. Being data-driven, it had the advantage that it could be adapted to a specific corpus or a particular speaker. Its main disadvantage was the size of the resulting prosodic database. In this paper, we propose statistical methods for reducing the prosodic database that this methodology requires. A 50% reduction can be obtained without compromising the naturalness of the synthetic speech, relative to our previous methodology with the same prosodic corpus. The trade-off between variability and reduction in prosodic contours is also discussed.
Keywords :
database management systems; natural language interfaces; speech synthesis; statistical analysis; Spanish prosody; TTS converters; data-driven modeling; fundamental frequency; methodology; monospeaker recorded corpus; prosodic contours; prosodic database; segmental duration; statistical methods; synthetic speech; text to speech synthesis; Contracts; Electronic mail; Feature extraction; Frequency conversion; Natural languages; Spatial databases; Speech recognition; Speech synthesis; Statistical analysis; Telecommunications
Conference_Title :
Proceedings of the Fourth International Conference on Spoken Language (ICSLP 96), 1996
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607870