Title :
Ranking and retrieval of image sequences from multiple paragraph queries
Author :
Gunhee Kim; Seungwhan Moon; Leonid Sigal
Author_Institution :
Seoul National University, Korea
fDate :
6/1/2015 12:00:00 AM
Abstract :
We propose a method to rank and retrieve image sequences from a natural language text query, consisting of multiple sentences or paragraphs. One of the method's key applications is to visualize visitors' text-only reviews on TRIPADVISOR or YELP, by automatically retrieving the most illustrative image sequences. While most previous work has dealt with the relations between a natural language sentence and an image or a video, our work extends to the relations between paragraphs and image sequences. Our approach leverages the vast user-generated resource of blog posts and photo streams on the Web. We use blog posts as text-image parallel training data that co-locate informative text with representative images that are carefully selected by users. We exploit large-scale photo streams to augment the image samples for retrieval. We design a latent structural SVM framework to learn the semantic relevance relations between text and image sequences. We present both quantitative and qualitative results on the newly created DISNEYLAND dataset.
Keywords :
"Image segmentation","Blogs","Image sequences","Streaming media","Semantics","Training","Natural languages"
Conference_Titel :
Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on
Electronic_ISBN :
1063-6919
DOI :
10.1109/CVPR.2015.7298810