Building a speech database for the purpose of speaker specific speech synthesis

Author

Hoory, R. ; Shaked, N. ; Chazan, D.

Author_Institution

IBM Israel Sci. & Technol. Center, Haifa, Israel

Volume

1

fYear

1996

fDate

14-18 Oct 1996

Firstpage

741

Abstract

This paper presents practical and theoretical work carried out at the IBM Research Laboratory during a speech synthesis project. The paper deals with two separate issues. The first is the generation of a compact set of English utterances that attains good phonetic coverage of the language. The second is the construction of a speaker-specific database. This starts with recording the speaker's speech, modeling it using a highly efficient speech representation, and segmenting it into phonemes. The phoneme segmentation is performed semi-automatically, using an iterative algorithm. A custom software tool named SPED was developed to simplify and speed up the segmentation process while improving its accuracy. The objective of the methodology presented is to generate new “voice fonts” for text-to-speech systems.
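The abstract does not say how the compact utterance set is chosen. One common way to approach a coverage problem of this kind is greedy set cover, sketched below purely as an illustration and not as the authors' method. The function name `greedy_cover`, the utterance IDs, and the phonetic unit labels are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_cover(candidates: Dict[str, Set[str]], targets: Set[str]) -> List[str]:
    """Greedily pick utterances until no candidate adds new target units.

    candidates maps a (hypothetical) utterance ID to the phonetic units it
    contains; targets is the set of units we would like to cover.
    """
    uncovered = set(targets)
    chosen: List[str] = []
    while uncovered and candidates:
        # Pick the utterance covering the most still-uncovered units.
        # Iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda u: len(candidates[u] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            break  # the remaining units occur in no candidate utterance
        chosen.append(best)
        uncovered -= gain
    return chosen


# Illustrative data only: unit labels are made up for this sketch.
example_candidates = {
    "utt_a": {"k-ae", "ae-t"},
    "utt_b": {"k-ae", "ae-t", "d-ao", "ao-g"},
    "utt_c": {"d-ao", "ao-g"},
    "utt_d": {"s-ah", "ah-n"},
}
example_targets = set().union(*example_candidates.values())
selected = greedy_cover(example_candidates, example_targets)
```

Greedy selection does not guarantee the smallest possible set, but it is simple and usually yields a compact one; a real system would also need a pronunciation lexicon to map text to phonetic units, which is outside this sketch.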

Keywords

database management systems; iterative methods; natural languages; speech processing; speech recognition; speech synthesis; English utterances generation; IBM Research Laboratory; SPED customized software; accuracy; iterative algorithm; phoneme segmentation; phonetic coverage; speaker specific database; speaker specific speech synthesis; speech database; speech modeling; speech recording; speech representation; speech segmentation; speech synthesis project; text to speech systems; voice fonts; Concatenated codes; Databases; Graphics; Laboratories; Paper technology; Speech recognition; Speech synthesis

fLanguage

English

Publisher

IEEE

Conference_Titel

Signal Processing, 1996., 3rd International Conference on

Conference_Location

Beijing

Print_ISBN

0-7803-2912-0

Type

conf

DOI

10.1109/ICSIGP.1996.567369

Filename

567369