Title :
Learning contextual relevance of audio segments using discriminative models over AUD sequences
Author :
Chaudhuri, Sourish ; Raj, Bhiksha
Author_Institution :
Language Technol. Inst., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
Effective retrieval of multimodal data requires accurate segmentation and analysis of that data. With easy access to many online audio and video sharing platforms, the volume of user-generated content recorded under far from ideal conditions has increased rapidly. One major issue with such content is the presence of semantically irrelevant segments, which introduce considerable contextual noise and make analysis difficult. In this paper, we present a discriminative large-margin approach that uses annotated data to learn which parts of the audio are relevant (while noting that the notion of relevance can be extremely subjective and challenging to define) and can automatically extract such segments from new audio.
Keywords :
audio signal processing; content-based retrieval; data analysis; image segmentation; information retrieval; learning (artificial intelligence); sequences; speech recognition; AUD sequences; audio segment selection; audio sharing; contextual noise; data segmentation; discriminative models; learning contextual relevance; multimodal data retrieval; semantic; video sharing; Acoustics; Context; Data mining; Feature extraction; Sports equipment; Training; Training data; AUDs; large margin discriminative training
Conference_Title :
Applications of Signal Processing to Audio and Acoustics (WASPAA), 2011 IEEE Workshop on
Conference_Location :
New Paltz, NY
Print_ISBN :
978-1-4577-0692-9
ISSN :
1931-1168
DOI :
10.1109/ASPAA.2011.6082335