Speech synthesis for a specific speaker based on a labeled speech database

Author

Hoory, R. ; Chazan, D.

Author_Institution

Dept. of Electr. Eng., Technion-Israel Inst. of Technol., Haifa, Israel

fYear

1994

fDate

9-13 Oct 1994

Firstpage

146

Abstract

This paper proposes a new text-to-speech synthesis technique for producing continuous, natural-sounding speech of a specific speaker. The technique selects short speech frames from a phoneme-labeled speech database; the selection minimizes a distortion criterion using a dynamic programming algorithm. The proposed scheme is more flexible than many existing schemes that use fixed speech segments, such as diphones, and yields more natural synthesized speech. An efficient speech representation is used to express the spectral continuity of speech simply and accurately. The database search mechanism and the database size were further improved by sectioning the speech phonemes into “steady-states” and “transitions”. The resulting synthesized speech quality is satisfactory and preserves the natural voice of the speaker.
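
The frame selection described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's distortion criterion or speech representation, so everything here is an assumption made for illustration: the name `select_frames`, the `mismatch_penalty` parameter, a target cost based on phoneme-label mismatch, and a continuity cost taken as the squared Euclidean distance between the feature vectors of consecutive frames. It is a generic Viterbi-style dynamic program, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

# Hypothetical frame type: (phoneme label, spectral feature vector).
Frame = Tuple[str, Tuple[float, ...]]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance, used here as an assumed continuity cost."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_frames(
    targets: Sequence[str],
    database: Sequence[Frame],
    mismatch_penalty: float = 10.0,  # assumed cost of a wrong phoneme label
) -> Tuple[float, List[int]]:
    """Pick one database frame per target label, minimizing total distortion.

    Total distortion = sum of label-mismatch costs
                     + sum of spectral distances between consecutive frames.
    Returns (total cost, list of database indices).
    """
    if not targets or not database:
        return 0.0, []
    n = len(database)

    def target_cost(t: int, j: int) -> float:
        return 0.0 if database[j][0] == targets[t] else mismatch_penalty

    # cost[j]: best cost of any path ending at database frame j.
    cost = [target_cost(0, j) for j in range(n)]
    back: List[List[int]] = []  # back[t-1][j]: best predecessor of j at step t

    for t in range(1, len(targets)):
        new_cost, pointers = [], []
        for j in range(n):
            # Best predecessor i for frame j under the continuity cost.
            best_i = min(
                range(n),
                key=lambda i: cost[i] + sq_dist(database[i][1], database[j][1]),
            )
            step = sq_dist(database[best_i][1], database[j][1])
            new_cost.append(cost[best_i] + step + target_cost(t, j))
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)

    # Trace back from the cheapest final frame.
    j = min(range(n), key=lambda i: cost[i])
    total = cost[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return total, path


database = [("a", (0.0,)), ("a", (1.0,)), ("b", (1.1,)), ("b", (5.0,))]
total, path = select_frames(["a", "b"], database)
print(path)
```

In this toy database the search prefers the "a" frame whose feature value lies closest to a "b" frame, which is the kind of spectral continuity the abstract refers to. The sketch compares every pair of frames at each step, so its running time grows quadratically with database size; restricting the candidates per phoneme would reduce that, but how the paper prunes its search is not stated in the abstract.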

Keywords

speech synthesis; database search mechanism; distortion criterion; dynamic programming; labeled speech database; minimization; spectral continuity; speech phoneme sectioning; speech representation; text to speech synthesis; Assembly; Databases; Dynamic programming; Heuristic algorithms; Loudspeakers; Minimization methods; Natural languages; Speech analysis; Speech synthesis; Stability;

fLanguage

English

Publisher

IEEE

Conference_Titel

Pattern Recognition, 1994. Vol. 3 - Conference C: Signal Processing, Proceedings of the 12th IAPR International Conference on

Conference_Location

Jerusalem

Print_ISBN

0-8186-6275-1

Type

conf

DOI

10.1109/ICPR.1994.577142

Filename

577142