DocumentCode :
32161
Title :
Random Regression Forests for Acoustic Event Detection and Classification
Author :
Phan, Huy Anh ; Maas, Marco ; Mazur, Radoslaw ; Mertins, Alfred
Author_Institution :
Inst. for Signal Process., Univ. of Lubeck, Lubeck, Germany
Volume :
23
Issue :
1
fYear :
2015
fDate :
Jan. 2015
Firstpage :
20
Lastpage :
31
Abstract :
Despite the success of the automatic speech recognition framework in its own application field, its adaptation to the problem of acoustic event detection has resulted in limited success. In this paper, instead of treating the problem similarly to the segmentation and classification tasks in speech recognition, we pose it as a regression task and propose an approach based on random forest regression. Furthermore, event localization in time can be efficiently handled as a joint problem. We first decompose the training audio signals into multiple interleaved superframes, which are annotated with the corresponding event class labels and their displacements to the temporal onsets and offsets of the events. For a specific event category, a random-forest regression model is learned using the displacement information. Given an unseen superframe, the learned regressor outputs continuous estimates of the onset and offset locations of the events. To deal with multiple event categories, prior to the category-specific regression phase, a superframe-wise recognition phase is performed to reject the background superframes and to classify the event superframes into different event categories. While jointly posing event detection and localization as a regression problem is novel, the superior performance on two databases, ITC-Irst and UPC-TALP, demonstrates the efficiency and potential of the proposed approach.
Keywords :
acoustic signal detection; audio signal processing; learning (artificial intelligence); random processes; regression analysis; signal classification; ITC-Irst; UPC-TALP; acoustic event classification; acoustic event detection; automatic speech recognition framework; background superframe rejection; category-specific regression phase; displacement information; multiple interleaved superframe; offset location estimation; onset location estimation; random-forest regression model; superframe-wise recognition phase; training audio signal decomposition; Acoustics; Hidden Markov models; Speech; Speech processing; Training; Vectors; Vegetation; Acoustic event detection; random forest; regression forest; superframe;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
2329-9290
Type :
jour
DOI :
10.1109/TASLP.2014.2367814
Filename :
6949625