CART-based modeling of Chinese tonal patterns with a functional model tracing the fundamental frequency trajectories

Author

Ni, Jinfu ; Sakai, Shinsuke ; Shimizu, Tohru ; Nakamura, Satoshi

Author_Institution

Nat. Inst. of Inf. & Commun. Technol.

fYear

2009

fDate

19-24 April 2009

Firstpage

4253

Lastpage

4256

Abstract

We propose an approach to modeling Chinese tonal patterns, focusing on the basic fundamental frequency (F₀) patterns characterized by contextual linguistic features that can be extracted directly from text. We analyze tonal patterns as sparse target points (tonal F₀ peaks and valleys) and represent them in parametric form within the framework of a functional F₀ model. The relationships between the target points and the underlying linguistic features are learned with classification and regression trees (CARTs). The functional model is used both to trace the F₀ trajectories when training the CARTs and to synthesize a tonal pattern from the target points that the CARTs predict. Our experiments indicate that the proposed method has low F₀ prediction errors. Using the F₀ ranges measured from training samples could significantly reduce the influence of voice-range differences when training a speaker-independent model. Furthermore, a few linguistic features, such as lexical tone context and the distinction between voiced and unvoiced initials, played the most important roles in characterizing tonal patterns.
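As a rough illustration of two ideas in the abstract (expressing F₀ target points relative to a speaker's measured F₀ range, and tracing a trajectory through sparse peaks and valleys), here is a minimal Python sketch. It is not the authors' functional F₀ model: the log-scale normalization, the cosine interpolation, and the names `normalize_f0`, `denormalize_f0`, and `trace_trajectory` are all assumptions made for illustration. The CART prediction step is omitted, so the target points are assumed to be given.

```python
import math


def normalize_f0(f0_hz, f0_min_hz, f0_max_hz):
    """Map an F0 value in Hz to a level within a speaker's range.

    Assumption: the range is measured per speaker and compared on a
    log scale, so f0_min_hz maps to 0.0 and f0_max_hz maps to 1.0.
    """
    span = math.log(f0_max_hz) - math.log(f0_min_hz)
    return (math.log(f0_hz) - math.log(f0_min_hz)) / span


def denormalize_f0(level, f0_min_hz, f0_max_hz):
    """Inverse of normalize_f0: map a range-relative level back to Hz."""
    span = math.log(f0_max_hz) - math.log(f0_min_hz)
    return math.exp(math.log(f0_min_hz) + level * span)


def trace_trajectory(targets, step=0.01):
    """Trace a smooth curve through sparse (time_s, level) target points.

    Assumption: cosine interpolation stands in for the functional F0
    model, whose exact form is not given in the abstract.
    """
    points = []
    for (t0, y0), (t1, y1) in zip(targets, targets[1:]):
        n = max(1, round((t1 - t0) / step))
        for i in range(n):
            # Weight rises smoothly from 0 to 1 between two targets.
            w = (1 - math.cos(math.pi * i / n)) / 2
            points.append((t0 + i * (t1 - t0) / n, y0 + w * (y1 - y0)))
    points.append(targets[-1])
    return points


# Hypothetical valley-peak-valley targets, as range-relative levels.
targets = [(0.0, 0.2), (0.15, 0.8), (0.30, 0.3)]
curve = trace_trajectory(targets)

# The same normalized pattern rendered in two hypothetical voice ranges.
low_voice = [(t, denormalize_f0(y, 80.0, 200.0)) for t, y in curve]
high_voice = [(t, denormalize_f0(y, 150.0, 400.0)) for t, y in curve]
```

Working with range-relative levels means the same predicted pattern can be rendered in different voice ranges, which is one plausible reading of how measured F₀ ranges could help a speaker-independent model.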

Keywords

learning (artificial intelligence); speech processing; speech synthesis; Chinese tonal patterns; Prosody modeling; CART-based modeling; contextual linguistic features; functional model tracing; fundamental frequency trajectories; machine learning; regression tree analysis; speaker-independent model; Classification tree analysis; Context modeling; Data mining; Frequency; Hidden Markov models; Natural languages; Pattern analysis; Predictive models; functional F0 model

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on

Conference_Location

Taipei

ISSN

1520-6149

Print_ISBN

978-1-4244-2353-8

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2009.4960568

Filename

4960568