Title :
Detecting person presence in TV shows with linguistic and structural features
Author :
Bechet, Frederic ; Favre, Benoit ; Damnati, Geraldine
Author_Institution :
LIF, Aix Marseille Univ., Marseille, France
Abstract :
Person detection and recognition in videos is a hard problem because of the intrinsic ambiguities of the sound and image channels and of their interaction. Whatever method is used to extract person hypotheses from the audio or image channels, person recognition in videos relies on a multimodal decision process that merges these hypotheses to decide, for each frame, who is present in the video at the audio level, at the image level, or at the content level (a person mentioned in speech or in inserted text boxes). Within this framework, this paper focuses on producing a list of person presence hypotheses from the audio channel of a video document only, to be combined by a multimodal fusion process with person presence detected at the image level. We use two kinds of features: linguistic features, corresponding to the way a person is mentioned by a speaker, and structural features, corresponding to the context in which a name occurs in a show. We show that the two sets of features are complementary and that good results can be achieved on a TV show corpus annotated with person presence labels.
Keywords :
feature extraction; image fusion; image recognition; natural language processing; speech recognition; video signal processing; TV shows; audio channel; image channel ambiguity; linguistic features; multimodal decision process; multimodal fusion process; person hypotheses extraction; person presence detection; person recognition; sound channel ambiguity; structural features; video document; Face; Feature extraction; Pragmatics; Speech; Speech recognition; TV; Videos; Boosting; Identification of persons; Named Entity; Spoken Language Understanding;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2012.6289062