Automatic generation of prosodic rules for speech synthesis

Author

Yamashita, Yoichi; Mizoguchi, R.

Author_Institution

Inst. of Sci. & Ind. Res., Osaka Univ., Japan

Volume

i

fYear

1994

fDate

19-22 Apr 1994

Abstract

This paper describes the automatic generation of speech synthesis rules that predict the accent component value (stress level) of each bunsetsu in long noun phrases. The rules are inductively inferred from a large amount of speech data using two tree-based methods: conventional tree generation and the SBR-tree (single best rule) algorithm. The rule sets generated by the two methods perform almost identically, reducing the prediction error of the accent component value from 23 Hz to about 14 Hz. The rate at which the direction of change (increase or decrease) between adjacent bunsetsu pairs is correctly reproduced is also used as an evaluation measure; the generated rule sets correctly reproduce about 80% of these changes. The effectiveness of the rule sets is verified through a listening test. The SBR-tree method generates very compact rules that are easy for human experts to interpret and that are consistent with earlier studies.
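
The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the prediction error is averaged, so root-mean-square error is assumed here, and the function names and the sample values are hypothetical. Each list holds one accent component value (in Hz) per bunsetsu of a single phrase.

```python
import math


def rms_error(observed, predicted):
    """Root-mean-square error in Hz (assumed error definition, not from the paper)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def change_reproduction_rate(observed, predicted):
    """Fraction of adjacent bunsetsu pairs whose direction of change
    (increase or decrease) is the same in the observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    pairs = len(observed) - 1
    hits = sum(
        1
        for i in range(pairs)
        if sign(observed[i + 1] - observed[i]) == sign(predicted[i + 1] - predicted[i])
    )
    return hits / pairs


# Hypothetical values for one four-bunsetsu phrase (illustration only).
observed = [40.0, 25.0, 30.0, 10.0]
predicted = [35.0, 28.0, 26.0, 12.0]

error_hz = rms_error(observed, predicted)
rate = change_reproduction_rate(observed, predicted)
print(f"RMS error: {error_hz:.2f} Hz, change reproduction rate: {rate:.2f}")
```

In this made-up phrase the predicted values move in the same direction as the observed ones for two of the three adjacent pairs, so the second measure is 2/3 even though the first pair's magnitude is underestimated.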

Keywords

learning (artificial intelligence); natural languages; speech synthesis; tree data structures; Japanese; SBR-tree algorithm; accent component value; adjacent bunsetsu pairs; automatic generation; bunsetsu; correct reproduction; long noun phrases; performance; prediction error; prosodic rules; rule set; single best rule; speech synthesis; stress level; tree generation algorithm; tree-based methods; Acoustic testing; Decision trees; Frequency; Humans; Neural networks; Speech synthesis; Stochastic processes; Stress;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on

Conference_Location

Adelaide, SA

ISSN

1520-6149

Print_ISBN

0-7803-1775-0

Type

conf

DOI

10.1109/ICASSP.1994.389224

Filename

389224