Title :
A scalable coder designed for 10-kHz bandwidth speech
Author :
Oshikiri, Masahiro ; Ehara, Hiroyuki ; Yoshida, Koji
Author_Institution :
Matsushita Commun. Ind. Co. Ltd., Yokosuka, Japan
Abstract :
This paper presents a scalable speech coder that encodes 10-kHz bandwidth speech signals at a bit rate of 23.85 kbit/s. The perceptual quality of 10-kHz bandwidth speech is much better than that of 7-kHz bandwidth speech and is close to that of 20-kHz bandwidth speech. The 10-kHz bandwidth is therefore promising for high-fidelity conversational applications. The scalable coder consists of two layers: a base layer and an enhancement layer. The adaptive multi-rate wideband speech coder (AMR-WB) at 15.85 kbit/s is used for the base layer, and a transform coding method at 8 kbit/s is used for the enhancement layer. This hybrid structure ensures efficient coding of 10-kHz bandwidth speech. The enhancement layer uses the modified discrete cosine transform (MDCT), with a short analysis frame to minimize the additional algorithmic delay; the total additional algorithmic delay of the enhancement layer is 5 ms. Since it is difficult to quantize all the MDCT coefficients at 8 kbit/s, we limit the region for quantization to the band from 6 kHz to 9 kHz to improve the perceptual quality of the decoded speech. Our subjective evaluation results indicate that the quality of the proposed coder clearly exceeds that of AMR-WB at 23.85 kbit/s under both clean and noisy conditions.
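The layer arithmetic in the abstract can be checked with a small sketch. The two layer rates and the 6 kHz to 9 kHz band come from the abstract, and 20 ms is the standard AMR-WB frame length. The sampling rate (`FS_HZ = 24000`) and the MDCT size (`N_COEFS = 120`) are hypothetical values chosen only for illustration, since the abstract does not state them, and the helper names are likewise made up here.

```python
# Layer bit rates quoted in the abstract, in bit/s.
BASE_RATE = 15850   # AMR-WB base layer
ENH_RATE = 8000     # MDCT-based enhancement layer

# ASSUMPTIONS (not given in the abstract): sampling rate and MDCT size.
FS_HZ = 24000       # hypothetical sampling rate
N_COEFS = 120       # hypothetical number of MDCT coefficients per block


def bits_per_frame(rate_bps, frame_ms):
    """Bits available in one frame lasting frame_ms milliseconds."""
    return rate_bps * frame_ms // 1000


def mdct_bins_in_band(fs_hz, n_coefs, lo_hz, hi_hz):
    """Indices k whose centre frequency (k + 0.5) * fs / (2 * n_coefs)
    lies in [lo_hz, hi_hz). Integer arithmetic avoids rounding issues."""
    return [k for k in range(n_coefs)
            if 4 * n_coefs * lo_hz <= (2 * k + 1) * fs_hz < 4 * n_coefs * hi_hz]


if __name__ == "__main__":
    # Bit budget per 20 ms frame (the AMR-WB frame length).
    base_bits = bits_per_frame(BASE_RATE, 20)
    enh_bits = bits_per_frame(ENH_RATE, 20)
    print(base_bits, enh_bits, base_bits + enh_bits)  # 317 160 477

    # Coefficients that fall in the quantized 6 kHz to 9 kHz band,
    # under the hypothetical FS_HZ and N_COEFS above.
    bins = mdct_bins_in_band(FS_HZ, N_COEFS, 6000, 9000)
    print(bins[0], bins[-1], len(bins))  # 60 89 30
```

Under these assumed values only 30 of the 120 coefficients per block would need to be quantized, which illustrates why restricting the band makes an 8 kbit/s budget more workable; the actual figures depend on parameters the abstract does not give.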
Keywords :
adaptive codes; delays; discrete cosine transforms; minimisation; speech coding; transform coding; variable rate codes; vocoders; 10 kHz; 23.85 kbit/s; 6 to 9 kHz; AMR-WB; MDCT; adaptive multi-rate wideband speech coder; algorithmic delay minimization; base layer; enhancement layer; high-fidelity conversational applications; modified discrete cosine transform; perceptual quality; scalable coder; speech coder; speech coding; transform coding; Added delay; Algorithm design and analysis; Bandwidth; Decoding; Discrete cosine transforms; Quantization; Speech coding; Speech enhancement; Transform coding; Wideband;
Conference_Title :
Speech Coding, 2002, IEEE Workshop Proceedings.
Print_ISBN :
0-7803-7549-1
DOI :
10.1109/SCW.2002.1215741