DocumentCode
967828
Title
Design of Multimodal Dissimilarity Spaces for Retrieval of Video Documents
Author
Bruno, Eric ; Moenne-Loccoz, Nicolas ; Marchand-Maillet, Stéphane
Author_Institution
Comput. Vision & Multimedia Lab., Univ. of Geneva, Geneva
Volume
30
Issue
9
fYear
2008
Firstpage
1520
Lastpage
1533
Abstract
The paper proposes a novel representation space for multimodal information, enabling fast and efficient retrieval of video data. We suggest describing the documents not directly by selected multimodal features (audio, visual, or text) but rather by considering cross-document similarities relative to their multimodal characteristics. This idea leads us to propose a particular form of dissimilarity space that is adapted to the asymmetric classification problem and, in turn, to the query-by-example and relevance feedback paradigm, widely used in information retrieval. Based on the proposed dissimilarity space, we then define various strategies to fuse modalities through a kernel-based learning approach. The problem of automatic kernel setting to adapt the learning process to the queries is also discussed. The properties of our strategies are studied and validated on artificial data. In a second phase, a large annotated video corpus (i.e., TRECVID '05) indexed by visual, audio, and text features is considered to evaluate the overall performance of the dissimilarity space and fusion strategies. The obtained results confirm the validity of the proposed approach for the representation and retrieval of multimodal information in a real-time framework.
Keywords
document handling; learning (artificial intelligence); query processing; relevance feedback; video retrieval; annotated video corpus; asymmetric classification problem; automatic kernel setting; cross-document similarity; fusion strategy; information retrieval; kernel-based learning approach; multimodal dissimilarity spaces; multimodal features; query-by-example; relevance feedback; video document retrieval; Concept learning; Image/video retrieval; Machine learning; Multimedia databases; Algorithms; Artificial Intelligence; Databases, Factual; Documentation; Image Enhancement; Image Interpretation, Computer-Assisted; Information Storage and Retrieval; Pattern Recognition, Automated; Radiology Information Systems; Reproducibility of Results; Sensitivity and Specificity; Subtraction Technique; Video Recording
fLanguage
English
Journal_Title
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher
ieee
ISSN
0162-8828
Type
jour
DOI
10.1109/TPAMI.2007.70801
Filename
4378388