A text-to-audiovisual-speech synthesizer for French

Author

Le Goff, Bertrand ; Benoît, Christian

Author_Institution

Institut de la Communication Parlée, Université Stendhal, Grenoble, France

Volume

4

fYear

1996

fDate

3-6 Oct 1996

Firstpage

2163

Abstract

An audiovisual speech synthesizer for unlimited French text is presented. It uses a 3-D parametric model of the face, controlled by eight parameters. Target values were assigned to these parameters for each French viseme, based on measurements made on a human speaker. Parameter trajectories are modeled by means of dominance functions associated with each parameter and each viseme. Each dominance function is characterized by three coefficients, so that coarticulation ultimately depends on the phonetic context, the speech rate, and a “hypo-hyper articulation” coefficient adjustable by the user. Finally, the visual and audiovisual intelligibility of the first version of the visual synthesizer was evaluated and compared with that of the acoustic synthesizer on which it was implemented.
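
The idea of dominance-weighted parameter trajectories can be illustrated with a minimal sketch. The abstract does not give the functional form, so the code below assumes a negative-exponential dominance function with three hypothetical coefficients (`magnitude`, `rate`, `power`), a form often used in coarticulation modeling; it may differ from the one the authors used. The target values, timings, and the function names `dominance` and `blend` are illustrative assumptions, not data or code from the paper.

```python
import math

def dominance(t, center, magnitude, rate, power):
    """Assumed dominance of one viseme at time t (seconds).

    Peaks at `center` and decays as exp(-rate * |t - center| ** power).
    The three coefficients are hypothetical stand-ins for the three
    coefficients mentioned in the abstract.
    """
    return magnitude * math.exp(-rate * abs(t - center) ** power)

def blend(t, segments):
    """Dominance-weighted average of the viseme targets at time t."""
    weights = [
        dominance(t, s["center"], s["magnitude"], s["rate"], s["power"])
        for s in segments
    ]
    weighted_sum = sum(w * s["target"] for w, s in zip(weights, segments))
    return weighted_sum / sum(weights)

# Two illustrative visemes acting on one facial parameter (made-up values).
segments = [
    {"target": 0.0, "center": 0.10, "magnitude": 1.0, "rate": 20.0, "power": 1.0},
    {"target": 1.0, "center": 0.30, "magnitude": 1.0, "rate": 20.0, "power": 1.0},
]

# Sample the blended trajectory every 50 ms from 0 s to 0.4 s.
trajectory = [blend(k * 0.05, segments) for k in range(9)]
```

In this sketch, raising `rate` narrows each viseme's influence so the parameter comes closer to its target, while lowering it lets neighboring visemes pull the value toward their own targets. That is one plausible way a user-adjustable articulation coefficient or a speech-rate change could act on coarticulation.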

Keywords

audio-visual systems; speech intelligibility; speech synthesis; 3D parametric face model; French viseme; acoustic synthesizer; audiovisual intelligibility; coarticulation; dominance functions; human speaker measurements; hypo-hyper articulation coefficient; parameter trajectories; phonetic context; speech rate; target values; text-to-audiovisual-speech synthesizer; unlimited French text; visual intelligibility; Communication system control; Degradation; Face; Facial animation; Humans; Loudspeakers; Parametric statistics; Speech enhancement; Speech synthesis; Synthesizers

fLanguage

English

Publisher

IEEE

Conference_Titel

Proceedings of the Fourth International Conference on Spoken Language, 1996 (ICSLP 96)

Conference_Location

Philadelphia, PA

Print_ISBN

0-7803-3555-4

Type

conf

DOI

10.1109/ICSLP.1996.607232

Filename

607232