Title :
Multimodal information fusion and temporal integration for violence detection in movies
Author :
Penet, Cédric ; Demarty, Claire-Hélène ; Gravier, Guillaume ; Gros, Patrick
Author_Institution :
Technicolor, Cesson-Sévigné, France
Abstract :
This paper presents a violent shot detection system and studies several methods for introducing temporal and multimodal information into the framework. It also investigates different Bayesian network structure learning algorithms for modelling these problems. The system is trained and tested on the MediaEval 2011 Affect Task corpus, which comprises 15 Hollywood movies. Experiments show that both multimodality and temporality bring useful information to the system. Moreover, analysing the links between the variables of the resulting graphs yields important observations about the quality of the structure learning algorithms. Overall, our best system achieved a 50% false alarm rate and a 3% missed detection rate, placing it among the best submissions to the MediaEval 2011 campaign.
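As a rough illustration of the idea described in the abstract, the sketch below shows one common way to learn a Bayesian network structure over discretised per-shot multimodal features, with temporal context added by appending the previous shot's features. The feature names, the input file layout, and the use of the pgmpy library with a score-based hill-climbing search are assumptions for illustration only, not the authors' exact pipeline.

```python
# Minimal sketch (assumed workflow, not the paper's implementation):
# score-based Bayesian network structure learning over discretised
# per-shot audio/video features, with simple temporal integration.
import pandas as pd
from pgmpy.estimators import HillClimbSearch, BicScore


def add_temporal_context(df: pd.DataFrame, feature_cols, lag: int = 1) -> pd.DataFrame:
    """Append the features of the `lag` previous shots as extra columns."""
    out = df.copy()
    for l in range(1, lag + 1):
        for col in feature_cols:
            out[f"{col}_t-{l}"] = df[col].shift(l)
    return out.dropna().reset_index(drop=True)


# Hypothetical per-shot table: discretised descriptors plus the
# ground-truth violence label (one row per movie shot).
shots = pd.read_csv("shot_features.csv")  # assumed file and columns
audio_video_cols = ["audio_energy", "shot_length", "motion", "blood_ratio"]

data = add_temporal_context(shots, audio_video_cols, lag=1)

# Learn a DAG linking modalities, their temporal copies and the label.
search = HillClimbSearch(data)
dag = search.estimate(scoring_method=BicScore(data))
print(sorted(dag.edges()))
```

The learned edges indicate which variables are directly linked; inspecting them is one way to analyse the quality of the structure learning, in the spirit of the graph analysis mentioned in the abstract.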
Keywords :
belief networks; cinematography; graph theory; learning (artificial intelligence); object detection; sensor fusion; video signal processing; Bayesian network structure learning algorithm; Hollywood movies; MediaEval 2011 Affect Task corpus; false alarm; graph; missed detection; multimodal information fusion; temporal information; temporal integration; violence detection; violent shot detection system; Algorithm design and analysis; Ash; Bayesian methods; Feature extraction; Image color analysis; Measurement; Motion pictures; Bayesian networks; multimodal fusion; structure learning; temporal integration; violence detection;
Conference_Title :
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Kyoto, Japan
Print_ISBN :
978-1-4673-0045-2
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2012.6288397