Title :
Enhanced shape-invariant pitch and time-scale modification for concatenative speech synthesis
Author :
Pollard, M.P. ; Cheetham, B.M.G. ; Goodyear, C.C. ; Edgington, M.D. ; Lowry, A.
Author_Institution :
Dept. of Electr. Eng. & Electron., Liverpool Univ., UK
Abstract :
To preserve shape invariance when performing pitch or time-scale modification of sinusoidally-modelled voiced speech, the phases of the sinusoids used to model the glottal excitation are made to add coherently at estimated excitation points. Previous methods achieved this by estimating excitation phases at synthesis frame boundaries, disregarding the frequency modulation that may occur between the frame boundary and the nearest modified excitation point. This approximation can produce a significant misalignment of the excitation phases, leading to distortion of the temporal structure of the synthetic speech. In this paper, a shape-invariant technique is proposed which aligns the excitation phases at the excitation points themselves, whilst allowing for variations in the frequency of the sinusoidal components.
Keywords :
frequency modulation; speech synthesis; coherently additive phases; concatenative speech synthesis; distorted temporal structure; excitation phase alignment; excitation points; glottal excitation; pitch modification; shape invariance; sinusoidal component frequency variations; sinusoidally-modelled voiced speech; synthesis frame boundaries; time-scale modification; Frequency estimation; Frequency modulation; Laboratories; Phase distortion; Phase estimation; Power harmonic filters; Shape; Speech analysis; Speech recognition; Speech synthesis
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607884
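Note :
The phase misalignment described in the abstract can be illustrated with a small numerical sketch. This is not the authors' algorithm. It is a toy calculation that assumes each harmonic's frequency varies linearly across one synthesis frame, and it compares the phase a harmonic accumulates between the frame boundary and an excitation point with and without that variation. The function names, the linear-frequency model, and the example values are all assumptions introduced here for illustration.

```python
import math


def phase_advance_constant(omega0, t0):
    """Phase (rad) accumulated from the frame boundary to time t0,
    assuming the frequency stays at its boundary value omega0 (rad/s)."""
    return omega0 * t0


def phase_advance_linear(omega0, omega1, frame_len, t0):
    """Phase (rad) accumulated from the frame boundary to time t0 when the
    frequency moves linearly from omega0 to omega1 over frame_len seconds.
    This is the integral of omega(t) = omega0 + slope * t from 0 to t0."""
    slope = (omega1 - omega0) / frame_len
    return omega0 * t0 + 0.5 * slope * t0 ** 2


def misalignment(f0_start, f0_end, frame_len, t0, harmonic):
    """Phase error (rad) at the excitation point for one harmonic if the
    frequency modulation within the frame is ignored (assumed model)."""
    omega0 = 2.0 * math.pi * harmonic * f0_start
    omega1 = 2.0 * math.pi * harmonic * f0_end
    return (phase_advance_linear(omega0, omega1, frame_len, t0)
            - phase_advance_constant(omega0, t0))


if __name__ == "__main__":
    # Hypothetical values: pitch rising from 100 Hz to 110 Hz over a 10 ms
    # frame, with the excitation point 5 ms after the frame boundary.
    for k in (1, 5, 10):
        err = misalignment(100.0, 110.0, 0.010, 0.005, k)
        print(f"harmonic {k:2d}: misalignment = {err:.3f} rad")
```

Under these assumed values the error grows in proportion to the harmonic number and reaches pi/4 rad (about 0.785 rad) at the tenth harmonic, so the higher harmonics no longer add coherently at the excitation point. This is the kind of temporal distortion the abstract attributes to frame-boundary phase estimation.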