DocumentCode :
3474426
Title :
The waveform interpolation paradigm. Foundation of a class of speech coders
Author :
Burnett, Ian
Author_Institution :
Dept. of Electr., Comput. & Telecommun. Eng., Wollongong Univ., NSW, Australia
Volume :
1
fYear :
1997
fDate :
4-4 Dec. 1997
Abstract :
Summary form only given. A new generation of speech coding algorithms, offering high-quality speech compression at bit rates as low as 2.4 kb/s, has been developed. The algorithms that have been successful at these rates bear little resemblance to those developed for use at higher rates such as 8 kb/s. In particular, the use of CELP and its derivative architectures at 2.4 kb/s has proved ineffective, primarily because too few bits are available to adequately represent pitch periodicity in low-rate CELP implementations. Thus, for effective, high perceptual quality coding of speech at 2.4 kb/s, it has been necessary to develop coding algorithms with inherent periodicity rather than depend on the correlation techniques used to generate periodicity in CELP. Further, to achieve low rates while maintaining quality, the algorithms must exploit the evolutionary nature of speech in a manner not seen in, for example, CELP coders. The waveform interpolation (WI) paradigm offers both of these properties by representing speech or, more often, the LP residual as an evolving set of pitch cycle waveforms (known as prototype or characteristic waveforms). This explicit method of describing the pitch of the speech is reminiscent of first-generation vocoding algorithms but, while WI utilises many familiar concepts, such as LP coding and subsequent LSF quantisation, the majority of its concepts are new to speech coding. The formation of the speech/residual into an evolving surface of phase-aligned characteristic waveforms, and the subsequent decomposition of that surface into near-independent slowly and rapidly evolving surfaces for quantisation, is perhaps the most distinctive feature of WI coding. Further, the technique attains high quality at low rates by smoothly interpolating almost all of its parameters, which requires careful consideration of events such as pitch doubling. The technique is a truly hybrid speech coding algorithm, performing analysis in both the time and discrete frequency domains. In particular, the use of Fourier descriptions of the prototypes allows effective phase alignment and interpolation between characteristic waveforms of varying pitch.
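The phase alignment and surface decomposition described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`fourier_coeffs`, `align`, `decompose`), the grid search over phase offsets, and the moving-average smoother are all assumptions chosen for brevity. Each pitch cycle is reduced to a fixed number of Fourier-series coefficients, so cycles of different lengths (varying pitch) become directly comparable.

```python
import cmath
import math

def fourier_coeffs(cycle, n_harm):
    """Complex Fourier-series coefficients (harmonics 1..n_harm) of one pitch cycle."""
    n = len(cycle)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(cycle)) / n
            for k in range(1, n_harm + 1)]

def align(ref, cw, steps=256):
    """Rotate cw by the phase offset (found by grid search) that best matches ref."""
    def score(phi):
        # Correlation between ref and cw after a circular shift of phi radians.
        return sum((a * (b * cmath.exp(1j * k * phi)).conjugate()).real
                   for k, (a, b) in enumerate(zip(ref, cw), start=1))
    phi = max((2 * math.pi * s / steps for s in range(steps)), key=score)
    # A time shift appears as a linear phase term k * phi on harmonic k.
    return [b * cmath.exp(1j * k * phi) for k, b in enumerate(cw, start=1)]

def decompose(surface, span=3):
    """Split aligned coefficient vectors into slowly and rapidly evolving parts.

    The slow part is a causal moving average over `span` waveforms (an
    assumed smoother); the rapid part is whatever remains.
    """
    slow = []
    for i in range(len(surface)):
        window = surface[max(0, i - span + 1): i + 1]
        slow.append([sum(col) / len(window) for col in zip(*window)])
    rapid = [[c - s for c, s in zip(cw, sw)] for cw, sw in zip(surface, slow)]
    return slow, rapid
```

In this sketch, a surface built from identically shaped cycles yields a rapid part close to zero after alignment, which is the property that makes the two surfaces near-independent targets for quantisation.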
Keywords :
speech coding; 2.4 kbit/s; CELP; Fourier descriptions; LP coding; LP residual; LSF quantisation; bit rates; characteristic waveforms; discrete frequency domain analysis; evolving surface; first-generation vocoding algorithms; high perceptual quality coding; high-quality speech compression; hybrid speech coding algorithm; phase alignment; pitch cycle waveforms; pitch doubling; pitch periodicity; prototype waveforms; speech coders; surface decomposition; time domain analysis; waveform interpolation; Algorithm design and analysis; Bit rate; Frequency domain analysis; Interpolation; Performance analysis; Prototypes; Quantization; Speech analysis; Speech coding; Surface waves;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
TENCON '97. IEEE Region 10 Annual Conference. Speech and Image Technologies for Computing and Telecommunications., Proceedings of IEEE
Conference_Location :
Brisbane, Qld., Australia
Print_ISBN :
0-7803-4365-4
Type :
conf
DOI :
10.1109/TENCON.1997.647244
Filename :
647244