Title : 
Improved audio coding using a psychoacoustic model based on a cochlear filter bank
         
        
            Author : 
Baumgarte, Frank
         
        
            Author_Institution : 
Media Signal Process. Res. Dept., Agere Syst., Berkeley Heights, NJ, USA
         
        
            fDate : 
10/1/2002 12:00:00 AM
         
        
            Abstract : 
Perceptual audio coders use an estimated masked threshold to determine the maximum permissible just-inaudible noise level introduced by quantization. This estimate is derived from a psychoacoustic model mimicking the properties of masking. Most psychoacoustic models for coding applications use a uniform (equal bandwidth) spectral decomposition as a first step to approximate the frequency selectivity of the human auditory system. However, the equal filter properties of the uniform subbands do not match the nonuniform characteristics of cochlear filters and reduce the precision of psychoacoustic modeling. Even so, uniform filter banks are applied because they are computationally efficient. This paper presents a psychoacoustic model based on an efficient nonuniform cochlear filter bank and a simple masked threshold estimation. The novel filter-bank structure employs cascaded low-order IIR filters and appropriate down-sampling to increase efficiency. The filter responses are optimized for the modeling of auditory masking effects. Applied to audio coding, the new psychoacoustic model shows better performance in terms of bit rate and/or quality than other state-of-the-art models that use a uniform spectral decomposition. The low delay of the new model makes it particularly suitable for low-delay coders.
         
        
            Keywords : 
IIR filters; acoustic signal processing; audio coding; cascade networks; channel bank filters; data compression; delays; ear; filtering theory; hearing; noise; quantisation (signal); signal sampling; spectral analysis; audio coding; auditory masking effects; bit rate; cascaded low-order IIR filters; cochlear filter bank; down-sampling; equal bandwidth; equal filter properties; estimated masked threshold; frequency selectivity; human auditory system; low-delay coders; maximum permissible just-inaudible noise level; nonuniform characteristics; nonuniform cochlear filter bank; perceptual audio coders; psychoacoustic model; quantization; uniform filter banks; uniform spectral decomposition; uniform subbands; Audio coding; Bandwidth; Filter bank; Frequency; Humans; IIR filters; Matched filters; Noise level; Psychoacoustic models; Quantization;
         
        
            Journal_Title : 
Speech and Audio Processing, IEEE Transactions on
         
        
            DOI : 
10.1109/TSA.2002.804536