DocumentCode
417716
Title
Exploiting multiple modalities for interactive video retrieval
Author
Christel, Michael G. ; Huang, Chang ; Moraveji, Neema ; Papernick, Norman
Author_Institution
Carnegie Mellon Univ., Pittsburgh, PA, USA
Volume
3
fYear
2004
fDate
17-21 May 2004
Abstract
Aural and visual cues can be automatically extracted from video and used to index its contents. The paper explores the relative merits of the cues extracted from the different modalities for locating relevant shots in video, specifically reporting on the indexing and interface strategies used to retrieve information from the Video TREC 2002 and 2003 data sets, and the evaluation of the interactive search runs. For the documentary and news material in these sets, automated speech recognition produces rich textual descriptions derived from the narrative, with visual descriptions and depictions offering additional browsing functionality. Through speech and visual processing, storyboard interfaces with query-based filtering provide an effective interactive retrieval interface. Examples drawn from the Video TREC 2002 and 2003 search topics and results using these topics illustrate the utility of multiple-document storyboards and other interfaces incorporating the results of multimodal processing.
Keywords
content-based retrieval; feature extraction; image classification; image retrieval; query formulation; speech processing; speech recognition; text analysis; video databases; video signal processing; aural cue extraction; automated speech recognition; documentary material; indexing strategies; information retrieval; interactive retrieval interface; interactive search runs; interactive video retrieval; multimodal processing; multiple modalities; news material; query-based filtering; speech processing; storyboard interfaces; textual descriptions; visual cue extraction; visual descriptions; visual processing; Automatic speech recognition; Broadcasting; Data mining; Filtering; Indexing; Information retrieval; NIST; Speech processing; Speech recognition; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8484-9
Type
conf
DOI
10.1109/ICASSP.2004.1326724
Filename
1326724