Title :
Browsing videos by automatically detected audio events
Author :
Barbosa, Virgínia ; Pellegrini, T. ; Bugalho, M. ; Trancoso, Isabel
Author_Institution :
IST, UTL, Lisbon, Portugal
Abstract :
This paper focuses on Audio Event Detection (AED), a research area that aims to substantially improve access to audio in multimedia content. With the ever-growing quantity of multimedia documents uploaded to the Web, automatic description of the audio content of videos can provide useful information for indexing, archiving, and searching multimedia documents. Preliminary experiments with a sound effects corpus showed good results for training models. However, performance is lower on the real data test set, which contains overlapping audio events and continuous background noise. This paper describes the AED framework and the methodologies used to build six audio event detectors based on statistical machine learning tools (Support Vector Machines). The detectors showed promising improvements when background noise was added to the training data, which consists of clean sound effects that differ considerably from the audio events found in real-life videos and movies. A graphical interface prototype is also presented that allows browsing a movie by its content and provides an audio event description with time codes.
Keywords :
audio signal processing; cinematography; multimedia communication; statistical analysis; support vector machines; video retrieval; video signal processing; AED framework; World Wide Web; audio access; audio event description; audio event detection; clean sound effect; continuous background noise; graphical interface prototype; movie browsing; multimedia content; multimedia document archive; multimedia document index; multimedia document search; overlapping audio event; real audio event; real life movies; real life video; sound effect corpus; statistical machine learning tool; support vector machine; time code; video audio content; video browsing; Detectors; Event detection; Feature extraction; Motion pictures; Noise measurement; Speech; Videos;
Conference_Title :
EUROCON - International Conference on Computer as a Tool (EUROCON), 2011 IEEE
Conference_Location :
Lisbon
Print_ISBN :
978-1-4244-7486-8
DOI :
10.1109/EUROCON.2011.5929358