DocumentCode
310679
Title
Natural quality variable-rate spectral speech coding below 3.0 kbps
Author
Erzin, Engin ; Kumar, Arun ; Gersho, Allen
Author_Institution
Lucent Technol., Murray Hill, NJ, USA
Volume
2
fYear
1997
fDate
21-24 Apr 1997
Firstpage
1579
Abstract
We propose new techniques for natural quality variable-rate spectral speech coding at an average rate of 2.2 kbps for dialog speech and 2.8 kbps for monolog speech. The coder models the Fourier spectrum of each frame and builds on recent enhancements to the classical multiband excitation (MBE) approach. Its key features are new techniques for robust pitch estimation and tracking, efficient quantization of voiced and unvoiced spectra, and encoding of partial phase information, which together yield improved quality over earlier spectral vocoders. Subjective performance results are reported which show that the coder is very close in quality to the ITU-T G.723.1 algorithm at 5.3 kbps.
Keywords
Fourier transform spectra; cepstral analysis; rate distortion theory; spectral analysis; speech coding; speech processing; variable rate codes; vector quantisation; vocoders; 2.2 kbit/s; 2.8 kbit/s; Fourier spectrum; ITU-T G.723.1 algorithm; dialog speech; efficient quantization; monolog speech; multiband excitation; natural quality speech coding; partial phase information encoding; pitch tracking; reduced dimension vector quantisation; robust pitch estimation; subjective performance results; unvoiced spectra; variable-rate spectral speech coding; vocoders; voiced spectra; Bit rate; Cepstral analysis; Code standards; Encoding; Phase estimation; Quantization; Robustness; Spectral shape; Speech coding; Speech enhancement
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location
Munich
ISSN
1520-6149
Print_ISBN
0-8186-7919-0
Type
conf
DOI
10.1109/ICASSP.1997.596254
Filename
596254