Scalable Audio Compression at Low Bitrates

Author

Kandadai, Srivatsan ; Creusere, Charles D.

Author_Institution

Klipsch Sch. of Electr. & Comput. Eng., New Mexico State Univ., Las Cruces, NM

Volume

16

Issue

5

fYear

2008

fDate

7/1/2008

Firstpage

969

Lastpage

979

Abstract

A perceptually scalable audio coder generates a bit stream that contains layers of audio fidelity and is encoded in such a way that adding one of these layers enhances the reconstructed audio by an amount that is just noticeable by the listener. Such algorithms have applications such as music on demand at variable levels of fidelity, for instance using 3G and 4G cellular radio systems operating at different bit rates. While the MPEG-4 natural audio coder can create finely scalable bit streams using bit-sliced arithmetic coding (BSAC), its perceptual quality at low bit rates is poor. On the other hand, the nonscalable transform-domain weighted interleaved vector quantization (TWIN-VQ) performs well at low bit rates. In this paper, we present a modified version of the TWIN-VQ algorithm that generates a perceptually scalable bit stream with many fine layers of audio fidelity. Using TWIN-VQ as our base ensures the best possible perceptual quality at low bit rates. Specifically, the proposed scalable algorithm performs as well as TWIN-VQ at rates of 8 to 16 kb/s and outperforms scalable BSAC by between 64% and 172% at rates of less than 24 kb/s.

Keywords

3G mobile communication; 4G mobile communication; audio coding; data compression; vector quantisation; 3G cellular radio systems; 4G cellular radio systems; MPEG-4 natural audio coder; audio fidelity; bit sliced arithmetic coding; interleaved vector quantization; low bitrates; scalable audio compression; scalable bit streams; Arithmetic; Audio coding; Audio compression; Bit rate; Channel capacity; Land mobile radio cellular systems; MPEG 4 Standard; Scalability; Streaming media; Vector quantization; Objective audio quality metrics; perceptual coding; scalability; transform-domain weighted interleaved vector quantization (TWIN-VQ); vector quantization;

fLanguage

English

Journal_Title

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1558-7916

Type

jour

DOI

10.1109/TASL.2008.925881

Filename

4544823