Regional Information Center for Science and Technology (RICEST) - Methods for applying dynamic sinusoidal models to statistical parametric speech synthesis

DocumentCode :

3431237

Title :

Methods for applying dynamic sinusoidal models to statistical parametric speech synthesis

Author :

Hu, Qiong ; Stylianou, Yannis ; Maia, Ranniery ; Richmond, Korin ; Yamagishi, Junichi

Author_Institution :

Centre for Speech Technology Research, University of Edinburgh, Edinburgh, UK

Year :

2015

Date :

19-24 April 2015

Firstpage :

4889

Lastpage :

4893

Abstract :

Sinusoidal vocoders can generate high-quality speech, but they have not been extensively applied to statistical parametric speech synthesis. This paper presents two ways of using dynamic sinusoidal models for statistical speech synthesis, enabling the sinusoid parameters to be modelled in HMM-based synthesis. In the first method, features extracted from a fixed- and low-dimensional, perception-based dynamic sinusoidal model (PDM) are statistically modelled directly. In the second method, we convert both the static amplitude and the dynamic slope of all the harmonics of a signal, which we term the Harmonic Dynamic Model (HDM), to intermediate parameters (regularised cepstral coefficients) for modelling. During synthesis, HDM is then used to reconstruct speech. We have compared the voice quality of these two methods to the STRAIGHT cepstrum-based vocoder with mixed excitation in formal listening tests. Our results show that HDM with intermediate parameters can generate speech of quality comparable to STRAIGHT, while PDM direct modelling seems promising in terms of producing good speech quality without resorting to intermediate parameters such as cepstra.

Keywords :

feature extraction; hidden Markov models; signal reconstruction; speech coding; speech synthesis; statistical analysis; vocoders; HDM; HMM-based synthesis; PDM; feature extraction; formal listening test; harmonic dynamic model; perception-based dynamic sinusoidal model; sinusoid parameter modelling; sinusoidal vocoder; speech quality; speech reconstruction; statistical parametric speech synthesis; Adaptation models; Harmonic analysis; Hidden Markov models; High-temperature superconductors; Speech; Speech synthesis; Vocoders; Discrete cepstra; Parametric statistical speech synthesis; Quality; Sinusoidal model;

Language :

English

Publisher :

IEEE

Conference_Title :

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference_Location :

South Brisbane, QLD

Type :

conf

DOI :

10.1109/ICASSP.2015.7178900

Filename :

7178900

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3431237