DocumentCode
629057
Title
Audio event detection in movies using multiple audio words and contextual Bayesian networks
Author
Penet, Cedric ; Demarty, Claire-Helene ; Gravier, Guillaume ; Gros, Patrick
Author_Institution
Technicolor R&D France, Cesson-Sevigne, France
fYear
2013
fDate
17-19 June 2013
Firstpage
17
Lastpage
22
Abstract
This article investigates a novel use of the well-known audio word representation to detect specific audio events, namely gunshots and explosions, with the aim of improving robustness to soundtrack variability in Hollywood movies. An audio stream is processed as a sequence of stationary segments. Each segment is described by one or several audio words obtained by applying product quantization to standard features. This representation, which uses multiple audio words constructed via product quantization, is one of the novelties of this work. Based on this representation, Bayesian networks are used to exploit contextual information in order to detect audio events. Experiments are performed on a comprehensive, publicly available set of 15 movies. Results are comparable to state-of-the-art results obtained on the same dataset and show increased robustness to decision thresholds, although this limits the range of possible operating points in some conditions. Late fusion provides a solution to this issue.
Keywords
audio signal processing; audio streaming; belief networks; decision making; entertainment; quantisation (signal); sensor fusion; sequences; Bayesian networks; Hollywood movies; audio event detection; audio stream processing; audio word representation; contextual Bayesian networks; contextual information; decision threshold; explosions; gunshots; multiple audio word construction; product quantization; soundtrack variability; stationary segment sequence; Dictionaries; Event detection; Explosions; Feature extraction; Mel frequency cepstral coefficient; Motion pictures; Quantization (signal);
fLanguage
English
Publisher
IEEE
Conference_Titel
Content-Based Multimedia Indexing (CBMI), 2013 11th International Workshop on
Conference_Location
Veszprem
ISSN
1949-3983
Print_ISBN
978-1-4799-0955-1
Type
conf
DOI
10.1109/CBMI.2013.6576546
Filename
6576546