DocumentCode :
310679
Title :
Natural quality variable-rate spectral speech coding below 3.0 kbps
Author :
Erzin, Engin ; Kumar, Arun ; Gersho, Allen
Author_Institution :
Lucent Technol., Murray Hill, NJ, USA
Volume :
2
fYear :
1997
fDate :
21-24 Apr 1997
Firstpage :
1579
Abstract :
We propose new techniques for natural-quality variable-rate spectral speech coding at an average rate of 2.2 kbps for dialog speech and 2.8 kbps for monolog speech. The coder models the Fourier spectrum of each frame and builds on recent enhancements to the classical multiband excitation (MBE) approach. Its key features are new techniques for robust pitch estimation and tracking, for efficient quantization of voiced and unvoiced spectra, and for encoding of partial phase information, which together improve quality over earlier spectral vocoders. Subjective test results show that the coder is very close in quality to the ITU-T G.723.1 algorithm at 5.3 kbps.
Keywords :
Fourier transform spectra; cepstral analysis; rate distortion theory; spectral analysis; speech coding; speech processing; variable rate codes; vector quantisation; vocoders; 2.2 kbit/s; 2.8 kbit/s; Fourier spectrum; ITU-T G.723.1 algorithm; dialog speech; efficient quantization; monolog speech; multiband excitation; natural quality speech coding; partial phase information encoding; pitch tracking; reduced dimension vector quantisation; robust pitch estimation; subjective performance results; unvoiced spectra; variable-rate spectral speech coding; vocoders; voiced spectra; Bit rate; Cepstral analysis; Code standards; Encoding; Phase estimation; Quantization; Robustness; Spectral shape; Speech coding; Speech enhancement;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location :
Munich
ISSN :
1520-6149
Print_ISBN :
0-8186-7919-0
Type :
conf
DOI :
10.1109/ICASSP.1997.596254
Filename :
596254