DocumentCode :
3489522
Title :
A Fast Word Retrieval Technique Based on Kernelized Locality Sensitive Hashing
Author :
Mondal, Tanmoy ; Ragot, N. ; Ramel, Jean-Yves ; Pal, Umapada
Author_Institution :
Lab. d'Inf., Univ. Francois Rabelais Tours, Tours, France
fYear :
2013
fDate :
25-28 Aug. 2013
Firstpage :
1195
Lastpage :
1199
Abstract :
In this paper, we present a new and faster word retrieval approach that can deal with heterogeneous document image collections. A set of image features (statistical and Gabor wavelet) that inherently represent word images is extracted. These features are used to generate a hash table for fast retrieval of similar images from a very large image dataset. Decomposing and embedding high-dimensional features and complex distance functions into a low-dimensional Hamming space makes it possible to search items efficiently. However, existing methods do not apply to high-dimensional kernelized data when the underlying feature embedding for the kernel is unknown. This paper presents a generalization of locality sensitive hashing (LSH) to arbitrary kernels. The proposed algorithm provides sub-linear time similarity search and works for a wide class of similarity functions.
Keywords :
document image processing; feature extraction; image retrieval; statistical analysis; wavelet transforms; word processing; Gabor wavelet feature extraction; arbitrary kernel; complex distance function decomposition; complex distance function embedding; generalized LSH; generalized locality sensitive hashing; hash table generation; heterogeneous document image collections; high-dimensional feature decomposition; high-dimensional feature embedding; image feature extraction; kernelized locality sensitive hashing; low-dimensional Hamming space; similar image retrieval; similarity functions; statistical feature extraction; sublinear time similarity search; very large image dataset; word image representation; word retrieval technique; Accuracy; Databases; Feature extraction; Kernel; Pattern recognition; Training; Vectors; Fast word spotting; Gabor wavelet; Kernelized locality sensitive hashing (KLSH);
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
ISSN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2013.242
Filename :
6628803