DocumentCode :
968118
Title :
Encoding speech using prototype waveforms
Author :
Kleijn, W. Bastiaan
Author_Institution :
Speech Res. Dept., AT&T Bell Labs., Murray Hill, NJ, USA
Volume :
1
Issue :
4
fYear :
1993
fDate :
10/1/1993
Firstpage :
386
Lastpage :
399
Abstract :
Voiced speech is interpreted as a concatenation of slowly evolving pitch-cycle waveforms. This signal can be reconstructed by interpolation from a downsampled sequence of pitch-cycle waveforms with a rate of one prototype waveform per 20-30 ms interval. The prototype waveform is described by a set of linear-prediction (LP) filter coefficients describing the formant structure and a prototype excitation waveform, quantized with analysis-by-synthesis procedures. The speech signal is reconstructed by filtering an excitation signal consisting of the concatenation of (infinitesimal) sections of the instantaneous excitation waveforms. To obtain the correct level of periodicity, the short-term and the long-term correlations between the instantaneous excitation waveforms can be controlled explicitly. Thus, distortions such as noise, reverberation, and buzziness can be prevented. The coding method is easily combined with existing LP-based speech coders, such as CELP, for unvoiced signals. Excellent voiced speech quality is obtained at rates between 3.0 and 4.0 kb/s.
Keywords :
linear predictive coding; speech coding; 3 to 4 kbit/s; CELP; analysis-by-synthesis procedures; concatenation; excitation signal; instantaneous excitation waveforms; interpolation; linear-prediction filter coefficients; pitch-cycle waveforms; prototype waveforms; speech coding; speech signal reconstruction; unvoiced signals; voiced speech; Bit rate; Encoding; Filters; Interpolation; Prototypes; Reverberation; Signal sampling; Speech analysis; Speech coding; Speech processing;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.242484
Filename :
242484