Regional Information Center for Science and Technology - Companded quantization of speech MDCT coefficients

DocumentCode :

1239806

Title :

Companded quantization of speech MDCT coefficients

Author :

Nordén, Fredrik ; Hedelin, Per

Author_Institution :

Dept. of Commun. Technol., Aalborg Univ., Denmark

Volume :

Issue :

fYear :

2005

fDate :

3/1/2005 12:00:00 AM

Firstpage :

163

Lastpage :

173

Abstract :

Here, we propose speech-coding procedures that achieve high subjective quality while avoiding speech-specific processing and interframe exploitation. Thus, the scheme is tractable for packet-based voice communication and has the capability of coding generic audio. The architecture is based on a modified discrete cosine transform (MDCT) representation of the signal and combines efficient vector quantization (VQ) techniques with psychoacoustic principles. Weighted quantization of MDCT coefficients is performed using a codebook based on a statistical model of the multidimensional MDCT pdf. The weighting and the codebook are adapted for each frame to account for masking thresholds given by a psychoacoustic analysis. Actual quantization is performed using lattices, thereby achieving close to rate-independent complexity. The result is a coding scheme operational at a range of rates. Here, a particular instance at 16 kbit/s, using a sampling frequency of 8 kHz, is shown to perform better than an LD-CELP coder operating at the same rate, even though no interframe memory is exploited.
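
The general idea of companded lattice quantization mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract describes a codebook derived from a statistical model of the MDCT pdf with per-frame psychoacoustic weighting, whereas the sketch below substitutes a plain mu-law compressor and the scaled integer lattice as stand-ins. The function names and the `step` and `mu` parameters are hypothetical and chosen only for illustration.

```python
import math


def compress(x, mu=255.0):
    """Mu-law compressor (assumed stand-in); maps [-1, 1] onto [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y, mu=255.0):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def companded_quantize(coeffs, step=0.05, mu=255.0):
    """Compress each coefficient, round to the lattice step*Z^n, then expand.

    Rounding to the nearest lattice point costs the same regardless of how
    fine the step is, which is the sense in which lattice quantization has
    complexity that is nearly independent of rate.
    """
    indices = [round(compress(c, mu) / step) for c in coeffs]
    decoded = [expand(i * step, mu) for i in indices]
    return indices, decoded


if __name__ == "__main__":
    # Hypothetical normalized coefficients in [-1, 1].
    frame = [0.8, -0.3, 0.02, 0.0, -0.005]
    idx, rec = companded_quantize(frame)
    print(idx)
    print(rec)
```

In the compressed domain the quantization error is at most half a step, so after expansion small coefficients are reconstructed with finer resolution than large ones.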

Keywords :

audio coding; computational complexity; discrete cosine transforms; speech coding; statistical analysis; vector quantisation; voice communication; companded quantization; generic audio coding; interframe memory exploitation; modified discrete cosine transform; multidimensional MDCT; packet-based voice communication; psychoacoustic analysis; signal representation; speech MDCT coefficient; speech-coding procedure; speech-specific processing; vector quantization technique; weighted quantization; Discrete cosine transforms; Frequency; Lattices; Masking threshold; Multidimensional systems; Psychoacoustic models; Psychology; Sampling methods; Speech; Vector quantization; Audio coding; modified discrete cosine transform (MDCT); psycho acoustics; speech-coding; statistical modeling; vector quantization (VQ);

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/TSA.2004.838535

Filename :

1395961

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1239806