DocumentCode
67931
Title
Autoregressive Models for Statistical Parametric Speech Synthesis
Author
Shannon, Matt; Zen, Heiga; Byrne, William
Author_Institution
Dept. of Eng., Univ. of Cambridge, Cambridge, UK
Volume
21
Issue
3
fYear
2013
fDate
March 2013
Firstpage
587
Lastpage
597
Abstract
We propose using the autoregressive hidden Markov model (HMM) for speech synthesis. The autoregressive HMM uses the same model for parameter estimation and synthesis in a consistent way, in contrast to the standard approach to statistical parametric speech synthesis. It supports easy and efficient parameter estimation using expectation maximization, in contrast to the trajectory HMM. At the same time, its similarities to the standard approach allow use of established high-quality synthesis algorithms such as speech parameter generation considering global variance. The autoregressive HMM also supports a speech parameter generation algorithm that is not available for the standard approach or the trajectory HMM and that has particular advantages in the domain of real-time, low-latency synthesis. We show how to do efficient parameter estimation and synthesis with the autoregressive HMM and look at some of the similarities and differences between the standard approach, the trajectory HMM and the autoregressive HMM. We compare the three approaches in subjective and objective evaluations. We also systematically investigate which choices of parameters, such as autoregressive order and number of states, are optimal for the autoregressive HMM.
Keywords
hidden Markov models; speech synthesis; statistical analysis; autoregressive HMM model; autoregressive hidden Markov model; autoregressive order; expectation maximization; global variance; high quality synthesis algorithms; parameter estimation; speech parameter generation algorithm; statistical parametric speech synthesis; Acoustics; Hidden Markov models; Parameter estimation; Speech; Speech synthesis; Standards; Vectors; Acoustic modeling; autoregressive hidden Markov model; autoregressive processes; hidden Markov models (HMMs); speech; statistical parametric speech synthesis
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1558-7916
Type
jour
DOI
10.1109/TASL.2012.2227740
Filename
6353548