DocumentCode
2314855
Title
Enhanced shape-invariant pitch and time-scale modification for concatenative speech synthesis
Author
Pollard, M.P. ; Cheetham, B.M.G. ; Goodyear, C.C. ; Edgington, M.D. ; Lowry, A.
Author_Institution
Dept. of Electr. Eng. & Electron., Liverpool Univ., UK
Volume
3
fYear
1996
fDate
3-6 Oct 1996
Firstpage
1433
Abstract
To preserve shape invariance when performing pitch or time-scale modification of sinusoidally-modelled voiced speech, the phases of the sinusoids used to model the glottal excitation are made to add coherently at estimated excitation points. Previous methods achieved this by estimating excitation phases at synthesis frame boundaries, disregarding the frequency modulation that may occur between the frame boundary and the nearest modified excitation point. This approximation can produce a significant misalignment of the excitation phases, leading to distortion of the temporal structure of the synthetic speech. In this paper, a shape-invariant technique is proposed which aligns the excitation phases at excitation points, whilst allowing for variations in the frequency of the sinusoidal components.
Keywords
frequency modulation; speech synthesis; coherently additive phases; concatenative speech synthesis; distorted temporal structure; excitation phase alignment; excitation points; glottal excitation; pitch modification; shape invariance; sinusoidal component frequency variations; sinusoidally-modelled voiced speech; synthesis frame boundaries; time-scale modification; Frequency estimation; Frequency modulation; Laboratories; Phase distortion; Phase estimation; Power harmonic filters; Shape; Speech analysis; Speech recognition; Speech synthesis
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
0-7803-3555-4
Type
conf
DOI
10.1109/ICSLP.1996.607884
Filename
607884