DocumentCode
968118
Title
Encoding speech using prototype waveforms
Author
Kleijn, W. Bastiaan
Author_Institution
Speech Res. Dept., AT&T Bell Labs., Murray Hill, NJ, USA
Volume
1
Issue
4
fYear
1993
fDate
10/1/1993 12:00:00 AM
Firstpage
386
Lastpage
399
Abstract
Voiced speech is interpreted as a concatenation of slowly evolving pitch-cycle waveforms. This signal can be reconstructed by interpolation from a downsampled sequence of pitch-cycle waveforms with a rate of one prototype waveform per 20-30 ms interval. The prototype waveform is described by a set of linear-prediction (LP) filter coefficients describing the formant structure and a prototype excitation waveform, quantized with analysis-by-synthesis procedures. The speech signal is reconstructed by filtering an excitation signal consisting of the concatenation of (infinitesimal) sections of the instantaneous excitation waveforms. To obtain the correct level of periodicity, the short-term and the long-term correlations between the instantaneous excitation waveforms can be controlled explicitly. Thus, distortions such as noise, reverberation, and buzziness can be prevented. The coding method is easily combined with existing LP-based speech coders, such as CELP, for unvoiced signals. Excellent voiced speech quality is obtained at rates between 3.0 and 4.0 kb/s.
Keywords
linear predictive coding; speech coding; 3 to 4 kbit/s; CELP; analysis-by-synthesis procedures; concatenation; excitation signal; instantaneous excitation waveforms; interpolation; linear-prediction filter coefficients; pitch-cycle waveforms; prototype waveforms; speech signal reconstruction; unvoiced signals; voiced speech; Bit rate; Encoding; Filters; Interpolation; Prototypes; Reverberation; Signal sampling; Speech analysis; Speech coding; Speech processing
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/89.242484
Filename
242484