Title :
Predictive and mel-scale binary vector quantization of variable dimension spectral magnitude
Author :
Cho, Yong Duk ; Kim, Moo Young ; Kondoz, Ahmet
Author_Institution :
CCSR, Surrey Univ., Guildford, UK
Abstract :
In sinusoidal speech coding, the LP spectral envelope is limited in spectral accuracy if the order of the LP model is not high enough. Quantizing the residual spectrum of a low-order LP model may therefore be desirable for good-quality speech reconstruction. An investigation of the magnitude of the LP-residual spectrum shows that a predictive coding scheme removes a considerable amount of coding redundancy. The problem of a variable number of harmonics caused by pitch changes can be alleviated by a length-warping technique. The residual spectrum of the predictive coding is then represented by a mel-scale binary vector quantizer (MBVQ), which splits the variable-dimension harmonic bands into a fixed dimension based on the mel scale and represents each element of the code vector as a binary value. The optimal code vector for the MBVQ is derived by minimizing an error measure defined as the weighted square-sum of the difference between the original and synthesized spectral envelopes. The performance evaluation shows that predictive-coded MBVQ with a low-order LP model can achieve the effect of a considerably higher-order LP model. Additionally, the proposed method can be implemented with very low computational complexity in both time and space.
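The mel-scale band splitting and binary quantization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract gives no formulas, so everything below is an assumption: the mel mapping 2595*log10(1 + f/700), the 8 kHz sampling rate, the 8 equal-width mel bands, the two-level reconstruction (+delta / -delta), and the hypothetical names hz_to_mel, band_of_harmonics, binary_code and decode. For a two-level code, the weighted squared error sum_k w_k (r_k - s*delta)^2 in a band is minimized by taking s as the sign of sum_k w_k r_k, which is what binary_code does.

```python
import math


def hz_to_mel(f_hz):
    # Common mel-scale formula (assumed; the abstract does not specify one).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def band_of_harmonics(f0_hz, fs_hz=8000.0, n_bands=8):
    """Assign each pitch harmonic to one of n_bands equal-width mel bands.

    The number of harmonics varies with the pitch f0_hz, but the number
    of bands is fixed, so the band-level vector has a fixed dimension.
    """
    nyquist = fs_hz / 2.0
    n_harm = int(nyquist // f0_hz)  # harmonics at or below Nyquist
    mel_max = hz_to_mel(nyquist)
    bands = []
    for k in range(1, n_harm + 1):
        m = hz_to_mel(min(k * f0_hz, nyquist))
        # Clamp so a harmonic exactly at Nyquist falls in the last band.
        bands.append(min(int(n_bands * m / mel_max), n_bands - 1))
    return bands


def binary_code(residual, weights, bands, n_bands=8):
    """One bit per band: the sign of the weighted residual sum.

    Assumes each band is reconstructed as +delta (bit 1) or -delta (bit 0).
    A band containing no harmonics defaults to bit 1.
    """
    acc = [0.0] * n_bands
    for r, w, b in zip(residual, weights, bands):
        acc[b] += w * r
    return [1 if s >= 0.0 else 0 for s in acc]


def decode(bits, bands, delta=1.0):
    # Expand the fixed-length bit vector back to one value per harmonic.
    return [delta if bits[b] else -delta for b in bands]
```

With a 100 Hz pitch the sketch maps 40 harmonics onto 8 bits, and with a 200 Hz pitch it maps 20 harmonics onto the same 8 bits, which is the fixed-dimension property the abstract refers to.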
Keywords :
computational complexity; linear predictive coding; spectral analysis; speech coding; vector quantisation; LP-spectral envelope; coding redundancy; harmonics; length warping technique; low computational complexity; mel-scale binary vector quantization; optimal code vector; performance evaluation; pitch changes; predictive coding; residual spectrum; sinusoidal speech coding; speech reconstruction; variable dimension spectral magnitude; weighted square-sum; Computational complexity; Distortion measurement; Human computer interaction; Interpolation; Predictive coding; Predictive models; Redundancy; Speech coding; Vector quantization; Vocoders;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.861894