DocumentCode
3340902
Title
TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis
Author
Syrdal, Ann ; Stylianou, Yannis ; Garrison, Laurie ; Conkie, Alistair ; Schroeter, Juergen
Author_Institution
Res. Labs., AT&T Labs., Florham Park, NJ, USA
Volume
1
fYear
1998
fDate
12-15 May 1998
Firstpage
273
Abstract
In an effort to select a speech representation for our next-generation concatenative text-to-speech synthesizer, two candidates are investigated: TD-PSOLA and the harmonic plus noise model (HNM). A formal listening test was conducted in which the two candidates were rated on intelligibility, naturalness, and pleasantness. Their capacity for database compression and their computational load are also discussed. The results show that HNM consistently outperforms TD-PSOLA on all of these criteria except computational load. HNM allows high-quality speech synthesis without smoothing problems at segmental boundaries and without the buzziness or other oddities observed with TD-PSOLA.
Keywords
acoustic noise; speech intelligibility; speech synthesis; HNM; TD-PSOLA; buzziness; computational load; database compression; diphone based speech synthesis; formal listening test; harmonic plus noise model; high-quality speech synthesis; intelligibility; naturalness; next generation concatenative text-to-speech synthesizer; pleasantness; segmental boundaries; speech representation; Acoustic noise; Linear predictive coding; Man machine systems; Smoothing methods; Spatial databases; Speech analysis; Speech enhancement; Speech synthesis; Synthesizers; Testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location
Seattle, WA
ISSN
1520-6149
Print_ISBN
0-7803-4428-6
Type
conf
DOI
10.1109/ICASSP.1998.674420
Filename
674420