Automatic prosodic modeling for speaker and task adaptation in text-to-speech

Author

López-Gonzalo, Eduardo ; Rodríguez-García, Jose M. ; Hernández-Gómez, Luis ; Villar, Juan M.

Author_Institution

ETSI Telecomunicacion, Univ. Politecnica de Madrid, Spain

Volume

2

fYear

1997

fDate

21-24 Apr 1997

Firstpage

927

Abstract

One of the most important demands on future text-to-speech (TTS) systems is that they sound natural when embedded in a task or application that requires a particular speaking style from a particular speaker. We present a new prosodic modeling procedure that improves naturalness by adapting a TTS system to a new speaker and a new speaking style. The procedure extends our automatic data-driven methodology to model both fundamental frequency and segmental duration. Automatic linguistic and acoustic analyses are performed on both a task-dependent text corpus and the recorded material from the selected speaker.

Keywords

acoustic signal processing; linguistics; speech processing; speech synthesis; automatic acoustic analysis; automatic data driven method; automatic linguistic analysis; automatic prosodic modeling; fundamental frequency; naturalness; recorded material; segmental duration; speaker adaptation; speaking style; speech synthesizer; task adaptation; task dependent text corpus; text to speech systems; Acoustic materials; Electronic mail; Feature extraction; Frequency estimation; Loudspeakers; Pattern analysis; Performance analysis; Speech analysis; Speech synthesis; Telecommunications

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on

Conference_Location

Munich

ISSN

1520-6149

Print_ISBN

0-8186-7919-0

Type

conf

DOI

10.1109/ICASSP.1997.596088

Filename

596088