Title :
An analysis-by-synthesis encoding approach for multiple audio objects
Author :
Ziyu Yang;Maoshen Jia;Changchun Bao;Wenbei Wang
Author_Institution :
Speech and Audio Signal Processing Laboratory, College of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100124, China
Abstract :
Object-based audio techniques are becoming popular because they provide flexibility for personalized rendering. For encoding multiple audio objects, an approach based on intra-object sparsity was recently proposed. However, the strategy this approach uses to allocate the number of preserved time-frequency (TF) instants (NPTF) usually leads to unbalanced perceptual quality among the decoded audio objects. To overcome this issue, an analysis-by-synthesis (ABS) encoding approach for multiple audio objects is proposed in this work. Within the ABS framework, the NPTF allocated to each object is adjusted iteratively such that the maximum difference in preserved frame energy among all objects is minimized. Thereafter, the multiple audio objects are encoded into a downmix signal plus side information. Both objective and subjective evaluations validate that the proposed approach is robust to different types of audio objects while ensuring that all decoded audio objects have similar perceptual quality.
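The iterative NPTF adjustment described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify in detail. It is a simple greedy stand-in built on assumptions: the balancing metric is taken to be each object's preserved-energy ratio (energy of its largest TF instants divided by its total frame energy), the total NPTF budget per frame is fixed, and one TF instant at a time is moved from the best-preserved object to the worst-preserved one. The names `allocate_nptf`, `preserved_energy_ratios`, `spectra`, and `budget` are hypothetical.

```python
import numpy as np


def preserved_energy_ratios(cum_energy, nptf):
    """Fraction of each object's frame energy kept by its nptf largest TF instants."""
    total = cum_energy[:, -1]  # assumed nonzero for every object (no silent frames)
    kept = cum_energy[np.arange(len(nptf)), nptf]
    return kept / total


def allocate_nptf(spectra, budget, max_iter=10_000):
    """Greedy balancing of NPTF across objects for one frame (illustrative only).

    spectra: (num_objects, num_bins) array of TF magnitudes for one frame.
    budget:  total number of TF instants preserved across all objects
             (assumed to satisfy budget <= num_objects * num_bins).
    """
    energy = np.abs(spectra) ** 2
    num_objects, num_bins = energy.shape

    # Sort energies per object in descending order and prepend a zero column,
    # so that cum_energy[i, n] is the energy of object i's n largest instants.
    sorted_energy = -np.sort(-energy, axis=1)
    cum_energy = np.concatenate(
        [np.zeros((num_objects, 1)), np.cumsum(sorted_energy, axis=1)], axis=1
    )

    # Start from an equal split; give the remainder to the first objects.
    nptf = np.full(num_objects, budget // num_objects)
    nptf[: budget % num_objects] += 1

    for _ in range(max_iter):
        ratios = preserved_energy_ratios(cum_energy, nptf)
        spread = ratios.max() - ratios.min()
        hi, lo = int(np.argmax(ratios)), int(np.argmin(ratios))
        if hi == lo or nptf[hi] == 0 or nptf[lo] == num_bins:
            break
        # Analysis step: try moving one instant from the best- to the
        # worst-preserved object and keep the move only if the spread shrinks.
        trial = nptf.copy()
        trial[hi] -= 1
        trial[lo] += 1
        trial_ratios = preserved_energy_ratios(cum_energy, trial)
        if trial_ratios.max() - trial_ratios.min() >= spread:
            break
        nptf = trial
    return nptf
```

In this sketch a sparse object (a few dominant TF instants) ends up with a small NPTF and a spectrally flat object with a large one, which is the kind of rebalancing the abstract motivates. A real encoder would also have to resolve which object occupies each TF instant of the downmix, which is not modelled here.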
Keywords :
"Encoding","Resource management","MONOS devices","Rendering (computer graphics)","Data mining","Codecs","Discrete Fourier transforms"
Conference_Title :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2015 Asia-Pacific
DOI :
10.1109/APSIPA.2015.7415383