DocumentCode
1700393
Title
Automatic Audio-Visual Fusion for Aggression Detection Using Meta-information
Author
Lefter, Iulia ; Burghouts, Gertjan J. ; Rothkrantz, Leon J M
Author_Institution
Delft Univ. of Technol., Delft, Netherlands
fYear
2012
Firstpage
19
Lastpage
24
Abstract
We propose a new method for audio-visual sensor fusion and apply it to automatic aggression detection. While a variety of definitions of aggression exist, in this paper we treat it as any kind of behavior that has a disturbing effect on others. We have collected multimodal and unimodal assessments from human annotators, who gave aggression scores on a 3-point scale. No trivial fusion algorithm predicts the multimodal labels from the unimodal labels. We therefore propose an intermediate step that uncovers the structure of the fusion process. This step is based on what we call meta-features, and we identify a set of five that have an impact on the fusion process. We use simple, state-of-the-art low-level audio and video features to predict the level of aggression in audio and video, and we also predict the three most feasible meta-features. We show that adding the meta-features has a significant positive impact on predicting the multimodal label, compared with standard fusion techniques such as feature-level and decision-level fusion.
Keywords
audio signal processing; feature extraction; image fusion; object detection; video signal processing; audio features; automatic aggression detection; automatic audio-visual sensor fusion; decision level fusion; feature level fusion; meta-features; meta-information; multimodal label; video features; Accuracy; Context; Databases; History; Humans; Semantics; Vectors; aggression detection; audio-visual sensor fusion; high level fusion; surveillance;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advanced Video and Signal-Based Surveillance (AVSS), 2012 IEEE Ninth International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4673-2499-1
Type
conf
DOI
10.1109/AVSS.2012.13
Filename
6327978