Title :
Scalable Multiband Binaural Renderer for MPEG-H 3D Audio
Author :
Taegyu Lee ; Hyun Oh Oh ; Jeongil Seo ; Young-cheol Park ; Dae Hee Youn
Author_Institution :
Electr. & Electron. Eng. Dept., Yonsei Univ., Seoul, South Korea
Abstract :
To provide immersive 3D multimedia services, MPEG has launched MPEG-H, ISO/IEC 23008, “High Efficiency Coding and Media Delivery in Heterogeneous Environments.” As the audio part of this standard, MPEG-H 3D Audio has been standardized based on multichannel loudspeaker configurations (e.g., 22.2). Binaural rendering is a key application of 3D audio; however, previous studies have focused on low-complexity binaural rendering, such as IIR filter design for HRTFs, or on pre-/post-processing to resolve in-head localization or front-back confusion. In this paper, a new binaural rendering algorithm is proposed that supports a large number of input channel signals and provides high quality in terms of timbre; parts of this algorithm were adopted into MPEG-H 3D Audio. The proposed algorithm truncates the binaural room impulse response (BRIR) at the mixing time, the transition point from the early reflections to the late reverberation. The early and late parts are processed independently by variable order filtering in frequency domain (VOFF) and parametric late reverberation filtering (PLF), respectively. Further, a QMF domain tapped delay line (QTDL) is proposed to reduce complexity in the high-frequency band, based on human auditory perception and codec characteristics. A scalability scheme is adopted to cover a wide range of applications by adjusting the threshold of the mixing time. Experimental results show that the proposed algorithm achieves the audio quality of a binaural signal rendered with full-length BRIRs. A scalability test also shows that the proposed scheme provides a smooth trade-off between audio quality and computational complexity.
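The idea of splitting an impulse response at the mixing time can be illustrated with a minimal sketch. This is not the VOFF or PLF processing described in the abstract: exact time-domain convolution stands in for both stages, only to show where the split point falls and that the two branches, summed with the right delay, reproduce full-length rendering. The helper names `split_brir` and `render_split`, the sample rate, and the mixing-time value are assumptions made for illustration.

```python
import numpy as np

def split_brir(brir, mixing_time_s, fs):
    """Split an impulse response at the mixing time (hypothetical helper)."""
    n_mix = int(round(mixing_time_s * fs))
    n_mix = max(0, min(n_mix, len(brir)))  # clamp to the response length
    return brir[:n_mix], brir[n_mix:], n_mix

def render_split(x, brir, mixing_time_s, fs):
    """Render x through the early and late parts separately, then sum.

    Both branches use exact convolution here (an assumption); a real
    renderer would replace the late branch with a cheaper parametric model.
    """
    early, late, n_mix = split_brir(brir, mixing_time_s, fs)
    out = np.zeros(len(x) + len(brir) - 1)
    if len(early):
        y_early = np.convolve(x, early)
        out[:len(y_early)] += y_early
    if len(late):
        y_late = np.convolve(x, late)
        # The late part starts n_mix samples into the response.
        out[n_mix:n_mix + len(y_late)] += y_late
    return out

# Synthetic, exponentially decaying noise as a stand-in for one BRIR channel.
rng = np.random.default_rng(0)
fs = 16000
brir = rng.standard_normal(512) * np.exp(-np.arange(512) / 100.0)
x = rng.standard_normal(256)

y_split = render_split(x, brir, mixing_time_s=0.004, fs=fs)
y_full = np.convolve(x, brir)
print(np.allclose(y_split, y_full))
```

Because convolution is linear, the split by itself changes nothing; any saving in complexity, and any change in quality, comes from what replaces the exact convolution in each branch. Moving the split point earlier shifts more of the response into the cheaper late branch, which is the kind of trade-off the scalability scheme adjusts.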
Keywords :
filtering theory; frequency-domain analysis; video coding; MPEG-H 3D audio; PLF; QMF domain tapped delay line; QTDL; VOFF; binaural room impulse response; high efficiency coding and media delivery in heterogeneous environments; multichannel loudspeaker configuration; parametric late reverberation filtering; scalable multiband binaural renderer; variable order filtering in frequency domain; Decoding; Loudspeakers; Rendering (computer graphics); Reverberation; Three-dimensional displays; Time measurement; Transform coding; Binaural rendering; MPEG-H 3D Audio; headphones; multi-channel;
Journal_Title :
Selected Topics in Signal Processing, IEEE Journal of
DOI :
10.1109/JSTSP.2015.2425799