Title :
Audio Segmentation in AAC Domain for Content Analysis
Author :
Zhu, Rong ; Ai, Haojun ; Hu, Ruimin
Author_Institution :
Nat. Eng. Res. Center for Multimedia Software, Wuhan Univ., Wuhan, China
Abstract :
We focus on audio scene segmentation in the AAC domain for audio-based multimedia indexing and retrieval applications. In particular, we propose an MFCC extraction method that adapts to window switching in the AAC encoding process and is independent of the audio sampling frequency. We discuss a method for fusing MFCC features obtained from different window types in order to balance frequency and temporal resolution. A series of experiments based on the probability distribution of the MFCCs was carried out to test the effectiveness of the method in audio scene segmentation. The experimental results show that this compressed-domain approach comes close to the performance of a system based on PCM audio, while the CPU load decreases dramatically. This is significant for real-time analysis of audio content.
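The idea of computing MFCCs directly from AAC MDCT coefficients can be illustrated with a minimal sketch. It is not the authors' algorithm: the filter count, the 8 kHz upper band edge, the function names, and in particular the fusion rule (a plain average of the per-block MFCC vectors of a short-window frame) are assumptions made for illustration. The mel filter edges are defined in Hz and then sampled at the MDCT bin centres, so the same code serves 1024-coefficient long windows, 128-coefficient short windows, and different sampling frequencies.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_bins, sample_rate, n_filters=20, f_max=8000.0):
    """Triangular mel filters, defined in Hz and sampled at MDCT bin centres."""
    f_max = min(f_max, sample_rate / 2.0)
    top = hz_to_mel(f_max)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    # Centre frequency of MDCT bin k when a block has n_bins coefficients.
    bin_hz = [(k + 0.5) * sample_rate / (2.0 * n_bins) for k in range(n_bins)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_from_mdct(coeffs, sample_rate, n_filters=20, n_ceps=12):
    """MFCC-like vector from one block of MDCT coefficients."""
    power = [c * c for c in coeffs]
    bank = mel_filterbank(len(coeffs), sample_rate, n_filters)
    # Floor the band energy so empty bands (common for short blocks) stay finite.
    log_e = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
             for row in bank]
    # Unnormalised DCT-II of the log band energies.
    return [sum(log_e[j] * math.cos(math.pi * i * (j + 0.5) / n_filters)
                for j in range(n_filters))
            for i in range(n_ceps)]

def frame_mfcc(blocks, sample_rate):
    """blocks: [1024 coeffs] for a long window, or 8 lists of 128 for short windows.

    Assumed fusion rule: average the per-block MFCC vectors.
    """
    vectors = [mfcc_from_mdct(b, sample_rate) for b in blocks]
    return [sum(v[i] for v in vectors) / len(vectors)
            for i in range(len(vectors[0]))]

# Synthetic coefficients stand in for values taken from an AAC decoder.
long_frame = [[math.sin(0.01 * k) for k in range(1024)]]
short_frame = [[math.sin(0.08 * k) for k in range(128)] for _ in range(8)]
mfcc_long = frame_mfcc(long_frame, 44100)
mfcc_short = frame_mfcc(short_frame, 44100)
```

With 128-coefficient short blocks the bin spacing is eight times coarser, so the narrow low-frequency mel bands may contain no bin at all. This is one way to see why features from the two window types need to be fused with some care.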
Keywords :
audio coding; data compression; discrete cosine transforms; feature extraction; indexing; information retrieval; multimedia computing; sensor fusion; signal resolution; signal sampling; statistical distributions; AAC compression domain; AAC encoding process; CPU overload; MDCT domain; MFCC feature extraction algorithm; MFCC fusion method; PCM audio; adaptive window switch; audio sampling frequency; audio scene segmentation; audio-based multimedia indexing; audio-based multimedia retrieval; frequency resolution; probability distribution; real-time audio content analysis; temporal resolution; Application software; Audio coding; Content based retrieval; Encoding; Indexing; Layout; Mel frequency cepstral coefficient; Probability distribution; Sampling methods; Switches;
Conference_Title :
Wireless Communications, Networking and Mobile Computing, 2009. WiCom '09. 5th International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-3692-7
Electronic_ISBN :
978-1-4244-3693-4
DOI :
10.1109/WICOM.2009.5301778