Embedding Multi-Order Spatial Clues for Scalable Visual Matching and Retrieval

Author

Shiliang Zhang ; Qi Tian ; Qingming Huang ; Yong Rui

Author_Institution

Dept. of Comput. Sci., Univ. of Texas at San Antonio, San Antonio, TX, USA

Volume

4

Issue

1

fYear

2014

fDate

Mar-14

Firstpage

130

Lastpage

141

Abstract

Matching duplicate visual content across images is the basis of many vision tasks. Researchers have proposed various local descriptors for image matching, e.g., floating-point descriptors such as SIFT and SURF, and binary descriptors such as ORB and BRIEF. These descriptors suffer either from relatively expensive computation or from limited robustness due to their compact binary representation. This paper studies how to improve the matching efficiency and accuracy of floating-point descriptors and the matching accuracy of binary descriptors. To achieve this goal, we embed the spatial clues among local descriptors into a novel local feature, the multi-order visual phrase, which contains two complementary clues: 1) the center visual clues extracted at each image keypoint and 2) the neighbor visual and spatial clues of multiple nearby keypoints. Unlike existing visual phrase features, two multi-order visual phrases are flexibly matched by first matching their center visual clues and then estimating a match confidence by checking the spatial and visual consistency of their neighbor keypoints. Therefore, the multi-order visual phrase does not sacrifice the repeatability of the classic visual word and is more robust to quantization error than existing visual phrase features. We extract multi-order visual phrases from both SIFT and ORB and test them in image matching and retrieval tasks on UKbench, Oxford5K, and 1 million distractor images collected from Flickr. Comparisons with recent retrieval approaches clearly demonstrate the competitive accuracy and significantly better efficiency of our approaches.

Keywords

image matching; image representation; image retrieval; ORB; SIFT; Scalable Visual Matching; binary descriptors; compact binary representation; duplicate visual content matching; floating point descriptors; local descriptors; multi order spatial clues; novel local feature; spatial clues; visual phrase features; Feature extraction; Indexes; Quantization (signal); Robustness; Visualization; Vocabulary; Image local descriptor; large-scale image retrieval; visual vocabulary

fLanguage

English

Journal_Title

Emerging and Selected Topics in Circuits and Systems, IEEE Journal on

Publisher

IEEE

ISSN

2156-3357

Type

jour

DOI

10.1109/JETCAS.2014.2298272

Filename

6720206