DocumentCode
1548022
Title
Speech analysis/synthesis and modification using an analysis-by-synthesis/overlap-add sinusoidal model
Author
George, E. Bryan ; Smith, Mark J T
Author_Institution
Signal Process. Center of Technol., Lockheed-Martin Inc., Nashua, NH, USA
Volume
5
Issue
5
fYear
1997
fDate
9/1/1997 12:00:00 AM
Firstpage
389
Lastpage
406
Abstract
Sinusoidal modeling has been successfully applied to a broad range of speech processing problems, and offers advantages over linear predictive modeling and the short-time Fourier transform for speech analysis/synthesis and modification. This paper presents a novel speech analysis/synthesis system based on the combination of an overlap-add sinusoidal model with an analysis-by-synthesis technique to determine the model parameters. It describes this analysis procedure in detail, and introduces an equivalent frequency-domain algorithm that takes advantage of the computational efficiency of the fast Fourier transform (FFT). In addition, a refined overlap-add sinusoidal model capable of shape-invariant speech modification is derived, and a pitch-scale modification algorithm is defined that preserves speech bandwidth and eliminates noise migration effects. Analysis-by-synthesis achieves very high synthetic speech quality by accurately estimating the component frequencies, eliminating sidelobe interference effects, and effectively dealing with nonstationary speech events. The refined overlap-add synthesis model correlates well with analysis-by-synthesis, and modifies speech without objectionable artifacts by explicitly controlling shape invariance and phase coherence. The proposed analysis-by-synthesis/overlap-add (ABS/OLA) system allows for both fixed and time-varying time-, frequency-, and pitch-scale modifications, and computational shortcuts using the FFT algorithm make its implementation feasible using currently available hardware.
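The abstract does not spell out the analysis procedure, so the following is only a minimal sketch of the general analysis-by-synthesis idea for a single frame, not the authors' algorithm. Each pass fits one sinusoid to the current residual by least squares, keeps the candidate frequency that removes the most energy, and subtracts it before the next pass. The function names (`fit_sinusoid`, `abs_analyze`), the uniform frequency grid, the rectangular window, and the exhaustive time-domain search are all assumptions made for illustration. The paper's FFT-based frequency-domain algorithm and overlap-add synthesis are not reproduced here.

```python
import math


def fit_sinusoid(residual, omega):
    """Least-squares fit of a*cos(omega*n) + b*sin(omega*n) to `residual`.

    Returns (a, b, reduction), where `reduction` is the drop in squared
    error obtained by subtracting the fitted sinusoid.
    """
    cc = cs = ss = rc = rs = 0.0
    for n, r in enumerate(residual):
        c = math.cos(omega * n)
        s = math.sin(omega * n)
        cc += c * c
        cs += c * s
        ss += s * s
        rc += r * c
        rs += r * s
    det = cc * ss - cs * cs
    if abs(det) < 1e-12:  # degenerate basis (e.g. omega = 0); skip it
        return 0.0, 0.0, 0.0
    # Solve the 2x2 normal equations for the cosine/sine weights.
    a = (rc * ss - rs * cs) / det
    b = (rs * cc - rc * cs) / det
    return a, b, a * rc + b * rs


def abs_analyze(frame, num_components, grid_size=64):
    """Greedy analysis-by-synthesis over one frame (illustrative only).

    Candidate frequencies are pi*k/grid_size for k = 1..grid_size-1.
    Returns ([(amplitude, omega, phase), ...], final_residual), where each
    component models amplitude * cos(omega*n + phase).
    """
    residual = list(frame)
    components = []
    for _ in range(num_components):
        best = (0.0, 0.0, 0.0, 0.0)  # (reduction, omega, a, b)
        for k in range(1, grid_size):
            omega = math.pi * k / grid_size
            a, b, reduction = fit_sinusoid(residual, omega)
            if reduction > best[0]:
                best = (reduction, omega, a, b)
        _, omega, a, b = best
        # a*cos(wn) + b*sin(wn) == A*cos(wn + phi) with phi = atan2(-b, a)
        components.append((math.hypot(a, b), omega, math.atan2(-b, a)))
        # Synthesize the chosen component and remove it from the residual.
        residual = [
            r - (a * math.cos(omega * n) + b * math.sin(omega * n))
            for n, r in enumerate(residual)
        ]
    return components, residual
```

Calling `abs_analyze(frame, 2)` on a frame built from two sinusoids whose frequencies lie on the search grid should return those two components, strongest first, and leave a near-zero residual. Off-grid frequencies would need a finer grid or a refinement step, which this sketch omits.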
Keywords
correlation methods; fast Fourier transforms; frequency estimation; speech intelligibility; speech processing; speech synthesis; FFT algorithm; analysis by synthesis model; computational efficiency; correlation; fast Fourier transform; frequency estimation; frequency-domain algorithm; model parameters; nonstationary speech events; overlap-add sinusoidal model; overlap-add synthesis model; phase coherence; pitch scale modification algorithm; shape invariant speech modification; sidelobe interference effects; sinusoidal modeling; speech analysis/synthesis system; speech bandwidth; speech processing; synthetic speech quality; time varying frequency scale modification; time varying time scale modification; Algorithm design and analysis; Fourier transforms; Frequency domain analysis; Frequency estimation; Predictive models; Shape control; Speech analysis; Speech enhancement; Speech processing; Speech synthesis;
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/89.622558
Filename
622558