Automatic estimation of accentual attribute values of words to realize accent sandhi in Japanese text-to-speech conversion

Author

Minematsu, Nobuaki ; Kita, Ryuji ; Hirose, Keikichi

Author_Institution

Graduate Sch. of Inf. Sci. & Technol., Univ. of Tokyo, Japan

fYear

2002

fDate

11-13 Sept. 2002

Firstpage

107

Lastpage

110

Abstract

Accurate estimation of the accentual attribute values of words, which is required to apply the rules of Japanese word accent sandhi to prosody generation, is an important factor in realizing high-quality text-to-speech (TTS) conversion. The rules were formulated by Sagisaka et al. (1984) and are widely used in Japanese TTS converters. Applying these rules, however, requires the values of a few accentual attributes of each constituent word. In this paper, these values were estimated through a long series of listening experiments: accent-type data were collected for accentual phrases, and the attribute values were estimated from the phrase data, taking into account inter-speaker differences in knowledge of accent sandhi. The rules were further modified to improve their coverage of the obtained data. Evaluation experiments showed the high validity of the estimated values and the modified rules.

Keywords

parameter estimation; speech processing; speech synthesis; Japanese text-to-speech conversion; TTS conversion; accentual attribute values; accentual phrases; automatic estimation; high-quality text-to-speech conversion; inter-speaker differences; prosody generation; sandhi accent; words; Dictionaries; Distributed databases; Frequency; Information science; Natural languages; Research and development; Speech coding; Speech recognition; Speech synthesis; System performance

fLanguage

English

Publisher

IEEE

Conference_Titel

Speech Synthesis, 2002. Proceedings of 2002 IEEE Workshop on

Print_ISBN

0-7803-7395-2

Type

conf

DOI

10.1109/WSS.2002.1224383

Filename

1224383