Title :
KLT-based adaptive entropy-constrained quantization with universal arithmetic coding
Author :
Lee, Yoonjoo ; Kim, Moo Young
Author_Institution :
Dept. of Inf. & Commun. Eng., Sejong Univ., Seoul, South Korea
fDate :
11/1/2010 12:00:00 AM
Abstract :
For flexible speech coding, a Karhunen-Loève transform (KLT) based adaptive entropy-constrained quantization (KLT-AECQ) method is proposed. It consists of backward-adaptive linear predictive coding (LPC) estimation, KLT estimation from the time-varying LPC coefficients, scalar quantization of the speech signal in the KLT domain, and superframe-based universal arithmetic coding driven by the estimated KLT statistics. To minimize outliers in both rate and distortion, a new distortion criterion includes a penalty for rate increases. Gain-adaptive step-size selection and a bounded Gaussian source model further improve the perceptual quality. KLT-AECQ requires neither an explicit codebook nor a training step, so it can operate at an infinite number of rate-distortion points regardless of time-varying source statistics. For speech signals, the conventional KLT-based classified vector quantization (KLT-CVQ) and the proposed KLT-AECQ yield signal-to-noise ratios of 17.86 and 26.22, respectively, at around 16 kbit/s. The corresponding perceptual evaluation of speech quality (PESQ) scores are 3.87 and 4.04.
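The pipeline named in the abstract (LPC coefficients, a KLT derived from them, scalar quantization in the KLT domain, and a rate estimate) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the LPC coefficients are already given (no backward adaptation), models the source as an AR process driven by unit-variance white noise, uses a plain uniform quantizer with a fixed step, and replaces arithmetic coding with the standard high-rate Gaussian entropy approximation. All function names (`ar_autocorrelation`, `jacobi_eigh`, `klt_quantize`, `high_rate_bits`) are hypothetical.

```python
import math

def ar_autocorrelation(lpc, n_lags, n_impulse=512):
    """Autocorrelation of x[n] = sum_k lpc[k-1] * x[n-k] + e[n] (unit-variance e),
    approximated from a truncated impulse response of the synthesis filter."""
    h = []
    for n in range(n_impulse):
        v = 1.0 if n == 0 else 0.0
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                v += a * h[n - k]
        h.append(v)
    return [sum(h[n] * h[n + lag] for n in range(n_impulse - lag))
            for lag in range(n_lags)]

def jacobi_eigh(matrix, sweeps=50, tol=1e-18):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations.
    Returns (eigenvalues in descending order, basis with eigenvectors as columns)."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle in [-pi/4, pi/4] that zeroes a[p][q].
                phi = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(phi), math.sin(phi)
                for k in range(n):  # columns p, q of a and of v
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):  # rows p, q of a
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    order = sorted(range(n), key=lambda i: -a[i][i])
    eigvals = [a[i][i] for i in order]
    basis = [[v[k][i] for i in order] for k in range(n)]
    return eigvals, basis

def klt_quantize(x, basis, step):
    """Transform x into the KLT domain, quantize uniformly, and reconstruct."""
    n = len(x)
    y = [sum(basis[k][i] * x[k] for k in range(n)) for i in range(n)]
    idx = [round(val / step) for val in y]  # integer quantization indices
    x_hat = [sum(basis[k][i] * idx[i] * step for i in range(n)) for k in range(n)]
    return idx, x_hat

def high_rate_bits(variances, step):
    """High-rate estimate of the index entropy (bits per vector) for independent
    Gaussian components: h(Y_i) - log2(step), clipped at zero."""
    return sum(max(0.0, 0.5 * math.log2(2 * math.pi * math.e * lam) - math.log2(step))
               for lam in variances)

if __name__ == "__main__":
    lpc = [0.9]                      # assumed first-order predictor
    r = ar_autocorrelation(lpc, 4)
    cov = [[r[abs(i - j)] for j in range(4)] for i in range(4)]
    eigvals, basis = jacobi_eigh(cov)
    idx, x_hat = klt_quantize([1.0, 0.8, 0.5, 0.1], basis, step=0.1)
    print(idx, x_hat, high_rate_bits(eigvals, 0.1))
```

Because the basis is derived only from the LPC coefficients, a decoder holding the same coefficients could rebuild it without a stored codebook, and changing `step` moves the operating point continuously along the rate-distortion curve. The rate penalty, gain-adaptive step size, and bounded Gaussian model mentioned in the abstract are not modeled here.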
Keywords :
Gaussian processes; Karhunen-Loeve transforms; adaptive codes; arithmetic codes; linear codes; speech coding; vector quantisation; KLT-AECQ method; KLT-CVQ; KLT-based adaptive entropy-constrained quantization; KLT-based classified vector quantization; Karhunen-Loève Transform; LPC estimation; PESQ scores; backward-adaptive linear predictive coding; bounded Gaussian source model; estimated KLT statistics; gain adaptive step size selection; perceptual evaluation-of-speech quality; scalar quantization; signal-to-noise ratios; speech coding; superframe-based universal arithmetic coding; time-varying LPC coefficients; time-varying source statistics; Huffman coding; Quantization; Shape; Signal to noise ratio; Speech; Speech coding; Speech Coding, Karhunen-Loève Transform, Entropy-Constrained Quantization, High Rate Theory.;
Journal_Title :
Consumer Electronics, IEEE Transactions on
DOI :
10.1109/TCE.2010.5681146