DocumentCode
1416852
Title
Applying the harmonic plus noise model in concatenative speech synthesis
Author
Stylianou, Yannis
Author_Institution
Shannon Labs., AT&T Labs.-Res., Florham Park, NJ, USA
Volume
9
Issue
1
fYear
2001
fDate
1/1/2001 12:00:00 AM
Firstpage
21
Lastpage
29
Abstract
This paper describes the application of the harmonic plus noise model (HNM) for concatenative text-to-speech (TTS) synthesis. In the context of HNM, speech signals are represented as a time-varying harmonic component plus a modulated noise component. The decomposition of a speech signal into these two components allows for more natural-sounding modifications of the signal (e.g., by using different and better adapted schemes to modify each component). The parametric representation of speech using HNM provides a straightforward way of smoothing discontinuities of acoustic units around concatenation points. Formal listening tests have shown that HNM provides high-quality speech synthesis while outperforming other models for synthesis (e.g., TD-PSOLA) in intelligibility, naturalness, and pleasantness.
Keywords
acoustic signal processing; harmonics; noise; signal representation; smoothing methods; speech intelligibility; speech synthesis; acoustic units; adapted schemes; concatenative text-to-speech synthesis; discontinuities smoothing; formal listening tests; harmonic plus noise model; high-quality speech synthesis; modulated noise component; natural-sounding signal modifications; parametric speech representation; speech naturalness; speech pleasantness; speech signal decomposition; speech signals representation; time-varying harmonic component; Acoustic noise; Context modeling; Degradation; Filters; Linear predictive coding; Phase estimation; Signal synthesis; Speech processing; Speech synthesis; Transaction databases
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/89.890068
Filename
890068