DocumentCode
3672424
Title
Deep correlation for matching images and text
Author
Fei Yan;Krystian Mikolajczyk
Author_Institution
Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, United Kingdom, GU2 7XH
fYear
2015
fDate
6/1/2015
Firstpage
3441
Lastpage
3450
Abstract
This paper addresses the problem of matching images and captions in a joint latent space learnt with deep canonical correlation analysis (DCCA). The image and caption data are represented by the outputs of vision- and text-based deep neural networks. The high dimensionality of these features presents a great challenge in terms of memory and speed complexity when used in the DCCA framework. We address these problems with a GPU implementation and propose methods to deal with overfitting. This makes it possible to evaluate the DCCA approach on popular caption-image matching benchmarks. We compare our approach to other recently proposed techniques and present state-of-the-art results on three datasets.
Keywords
"Correlation","Yttrium","Graphics processing units","Protocols","Training","Libraries","Visualization"
Publisher
IEEE
Conference_Titel
Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on
Electronic_ISBN
1063-6919
Type
conf
DOI
10.1109/CVPR.2015.7298966
Filename
7298966