Title :
New techniques for sinusoidal coding of speech at 2400 bps
Author :
Ahmadi, Sassan ; Spanias, Andreas S.
Author_Institution :
Dept. of Electr. Eng., Arizona State Univ., Tempe, AZ, USA
Abstract :
Sinusoidal transform coding (STC) is a frequency-domain speech compression technique in which finite-duration segments of the speech signal are represented by a linear combination of sinusoids with time-varying amplitudes, phases, and frequencies. STC is known to produce reconstructed speech of high quality at data rates below 10 kbps. It can be shown that if the measured sine wave frequencies are replaced by a harmonic set, reconstructed speech of good quality can still be obtained. The methods discussed in this paper have been used to develop STC coders at data rates from 9.6 to 2.4 kbps, and they yield reconstructed speech of high quality and intelligibility. An accurate pitch detection algorithm, perception-based split vector quantization, improved overlap/add and frame interpolation algorithms, minimum variance phase estimation, and computational efficiency are the basic features that distinguish our implementations from other sinusoidal coders. This paper focuses on the development of a fully quantized sinusoidal coder at 2.4 kbps.
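The harmonic sinusoidal model named in the abstract can be illustrated with a minimal sketch. This is not the authors' coder: it only sums harmonics of a single pitch frequency over one frame. The function names, the 8 kHz sampling rate, the 160-sample frame, and the choice to hold amplitudes and phases constant within the frame are all assumptions made for illustration and are not taken from the record.

```python
import math


def max_harmonics(f0_hz, sample_rate_hz=8000):
    """Count the harmonics k*f0 that lie strictly below the Nyquist frequency."""
    nyquist_hz = sample_rate_hz / 2.0
    return math.ceil(nyquist_hz / f0_hz) - 1


def synthesize_harmonic_frame(f0_hz, amplitudes, phases,
                              sample_rate_hz=8000, num_samples=160):
    """Return one frame s[n] = sum_k A_k * cos(2*pi*k*f0*n/fs + phi_k).

    Assumption: A_k and phi_k are constant over the frame. A real coder
    would interpolate them between frames (e.g. by overlap/add).
    """
    if len(amplitudes) != len(phases):
        raise ValueError("amplitudes and phases must have the same length")
    frame = []
    for n in range(num_samples):
        t = n / sample_rate_hz  # time of sample n in seconds
        sample = 0.0
        # The k-th sinusoid sits at the k-th multiple of the pitch frequency.
        for k, (a, phi) in enumerate(zip(amplitudes, phases), start=1):
            sample += a * math.cos(2.0 * math.pi * k * f0_hz * t + phi)
        frame.append(sample)
    return frame


if __name__ == "__main__":
    # Two harmonics of a hypothetical 100 Hz pitch, zero phase.
    frame = synthesize_harmonic_frame(100.0, [1.0, 0.5], [0.0, 0.0])
    print(len(frame), max_harmonics(100.0))
```

Because every component frequency is a multiple of the pitch, only the pitch, the amplitudes, and the phases need to be described for each frame rather than a separate frequency for every sinusoid.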
Keywords :
cepstral analysis; interpolation; phase estimation; signal reconstruction; speech coding; speech intelligibility; speech processing; transform coding; vector quantisation; 2.4 kbit/s; computational efficiency; finite duration segments; frame interpolation algorithms; frequency-domain speech compression technique; high quality speech; low bit rate coding; minimum variance phase estimation; perception-based split vector quantization; pitch detection algorithm; reconstructed speech; sinusoidal transform coding; time-varying amplitude; time-varying frequency; time-varying phase; Detection algorithms; Frequency; Linear predictive coding; Signal analysis; Speech analysis
Conference_Title :
Signals, Systems and Computers, 1996. Conference Record of the Thirtieth Asilomar Conference on
Conference_Location :
Pacific Grove, CA, USA
Print_ISBN :
0-8186-7646-9
DOI :
10.1109/ACSSC.1996.601158