Title :
Natural quality variable-rate spectral speech coding below 3.0 kbps
Author :
Erzin, Engin ; Kumar, Arun ; Gersho, Allen
Author_Institution :
Lucent Technol., Murray Hill, NJ, USA
Abstract :
We propose new techniques for natural quality variable-rate spectral speech coding at an average rate of 2.2 kbps for dialog speech and 2.8 kbps for monolog speech. The coder models the Fourier spectrum of each frame and builds on recent enhancements to the classical multiband excitation (MBE) approach. Its key features are new techniques for robust pitch estimation and tracking, efficient quantization of voiced and unvoiced spectra, and encoding of partial phase information, which together yield improved quality over earlier spectral vocoders. Subjective performance results show that the coder is very close in quality to the ITU-T G.723.1 algorithm at 5.3 kbps.
Keywords :
Fourier transform spectra; cepstral analysis; rate distortion theory; spectral analysis; speech coding; speech processing; variable rate codes; vector quantisation; vocoders; 2.2 kbit/s; 2.8 kbit/s; Fourier spectrum; ITU-T G.723.1 algorithm; dialog speech; efficient quantization; monolog speech; multiband excitation; natural quality speech coding; partial phase information encoding; pitch tracking; reduced dimension vector quantisation; robust pitch estimation; subjective performance results; unvoiced spectra; variable-rate spectral speech coding; voiced spectra; Bit rate; Cepstral analysis; Code standards; Encoding; Phase estimation; Quantization; Robustness; Spectral shape; Speech coding; Speech enhancement
Conference_Title :
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location :
Munich
Print_ISBN :
0-8186-7919-0
DOI :
10.1109/ICASSP.1997.596254