Auditory scene analysis and recognition with LDA topic model

Author

Feng Su

Author_Institution

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China

fYear

2014

fDate

14-18 July 2014

Firstpage

1

Lastpage

6

Abstract

Analysis and recognition of auditory scenes play an important role in content-based multimedia processing and context-aware applications. In this paper, we propose an auditory scene recognition scheme that integrates two steps: analysis of a scene's audio data with the LDA topic model to discover latent structures (i.e., contextual correlations) of audio words, and generation of intermediate contextual descriptions of the audio data on the basis of the topics learned by LDA. We further combine the piecewise low-level audio features with the contextual features, represent an audio clip of an unknown scene as a set of these combined features, and classify it discriminatively using the Hough forest model. Experimental results demonstrate the effectiveness of the proposed scheme, which combines unsupervised topic modeling by LDA with supervised classification of auditory scenes by Hough forest.

Keywords

audio signal processing; signal classification; Hough forest model; LDA topic model; audio clip classification; audio words; auditory scene analysis; auditory scene recognition; content-based multimedia processing; context-aware applications; contextual feature; intermediate contextual description generation; latent structure discovery; piecewise low-level audio feature; supervised classification; unsupervised topic modeling; Accuracy; Context modeling; Correlation; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Auditory scene; LDA; environmental sound; hough forest; local discriminant bases;

fLanguage

English

Publisher

IEEE

Conference_Titel

Multimedia and Expo (ICME), 2014 IEEE International Conference on

Conference_Location

Chengdu

Type

conf

DOI

10.1109/ICME.2014.6890241

Filename

6890241