A novel pitch decomposition method for the generalized linear alignment model

Author

Langarani, Mahsa Sadat Elyasi ; Klabbers, Esther ; van Santen, Jan

Author_Institution

Center for Spoken Language Understanding, Oregon Health & Science University, Portland, OR, USA

fYear

2014

fDate

4-9 May 2014

Firstpage

2584

Lastpage

2588

Abstract

Superpositional models of intonation typically propose decomposing fundamental frequency (F₀) contours into phrase curves and accent curves, aligned with phrases and left-headed feet, respectively. Extracting these component curves from F₀ contours without making undue assumptions is challenging. We propose a novel method for decomposing pitch curves, based on the assumption that accent curves can be described by combining skewed normal distributions and sigmoid functions. In contrast to an earlier pitch decomposition algorithm (“PRISM”), this allows for simple joint optimization of phrase and accent curve parameters, using fewer parameters. The proposed method was evaluated on three speech corpora containing: (1) synthetically generated pitch curves, (2) all-sonorant utterances, and (3) utterances containing both sonorant and non-sonorant speech sounds. The root weighted mean squared error is small, and, on the corpus for which comparable data are available, is significantly smaller than for PRISM.
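
As a rough illustration of the superpositional idea summarized in the abstract, the sketch below builds an F₀ value as a phrase curve plus accent components shaped by skew-normal densities and a sigmoid, and scores a fit with a weighted root mean squared error. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear phrase curve, the way the skew-normal and sigmoid terms are combined, the parameter names, and the normalization of the error by the sum of the weights.

```python
import math


def skew_normal(t, loc, scale, shape):
    """Skew-normal density at time t; shape = 0 gives an ordinary normal density."""
    z = (t - loc) / scale
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 / scale * pdf * cdf


def sigmoid(t, midpoint, slope):
    """Logistic function rising from 0 to 1 around `midpoint`."""
    return 1.0 / (1.0 + math.exp(-slope * (t - midpoint)))


def phrase_curve(t, start_hz, end_hz, duration):
    """Hypothetical phrase curve: a straight line from start_hz to end_hz."""
    return start_hz + (end_hz - start_hz) * t / duration


def model_f0(t, phrase, accents, boundary):
    """Sum of a phrase curve, skew-normal accent bumps, and one sigmoid term.

    phrase   = (start_hz, end_hz, duration)
    accents  = list of (amplitude, loc, scale, shape), one per foot (assumed)
    boundary = (amplitude, midpoint, slope) for the sigmoid component (assumed)
    """
    value = phrase_curve(t, *phrase)
    for amplitude, loc, scale, shape in accents:
        value += amplitude * skew_normal(t, loc, scale, shape)
    amplitude, midpoint, slope = boundary
    value += amplitude * sigmoid(t, midpoint, slope)
    return value


def weighted_rmse(observed, predicted, weights):
    """Root of the weighted mean squared error, normalized by the weight sum."""
    squared = sum(w * (o - p) ** 2 for o, p, w in zip(observed, predicted, weights))
    return math.sqrt(squared / sum(weights))
```

Under these assumptions, a joint fit would search over the phrase, accent, and boundary parameters together to minimize `weighted_rmse`, with low weights on frames where the pitch estimate is unreliable (for example, non-sonorant segments).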

Keywords

mean square error methods; speech synthesis; text analysis; accent curves; all-sonorant utterances; component curves extraction; fundamental frequency contours; generalized linear alignment model; intonation; joint optimization; left-headed feet; nonsonorant speech sounds; phrase curves; pitch curves decomposition; root weighted mean squared error; sigmoid functions; skewed normal distributions; speech corpora; superpositional models; synthetically generated pitch curves; Conferences; Equations; Foot; Mathematical model; Protocols; Speech; Speech synthesis; prosody modeling; superpositional model; text-to-speech synthesis

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location

Florence

Type

conf

DOI

10.1109/ICASSP.2014.6854067

Filename

6854067