DocumentCode :
3486422
Title :
Efficient Word Image Retrieval Using Earth Movers Distance Embedded to Wavelets Coefficients Domain
Author :
Saabni, Raid
Author_Institution :
Dept. of Comput. Sci., Triangle R&D Center, Tel-Aviv, Israel
fYear :
2013
fDate :
25-28 Aug. 2013
Firstpage :
314
Lastpage :
318
Abstract :
In this paper we use the Earth Mover's Distance (EMD) to measure similarity between shapes for recognizing and searching Arabic words. We use the Shape Context and Angular Radial Partitioning descriptors to evaluate matching and recognition with EMD. Based on the encouraging accuracy and recall obtained, we follow the low-distortion embedding of the EMD to map the shapes in the database into a normed space of wavelet coefficients, as differences of coefficient histograms. The approximate k-nearest neighbors among the embedded shapes in the database are retrieved in sublinear time using Locality-Sensitive Hashing (LSH), which produces a short list of candidates. This short list is used in a filter-and-refine strategy: the exact results are obtained by applying the original EMD to the short list only. We demonstrate our method on the MNIST dataset and the freely available Arabic Printed Text Image (APTI) database. Our method achieves a speedup of four orders of magnitude over the exact method, at the cost of only a 2.4% reduction in accuracy.
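The filter-and-refine idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: shapes are reduced to 1-D histograms of equal total mass, the embedding is a simple weighted Haar-style transform standing in for the paper's wavelet embedding, and a linear scan over the embedded vectors stands in for the LSH lookup. The names `emd_1d`, `haar_embed`, `l1` and `filter_and_refine` are hypothetical.

```python
from itertools import accumulate


def emd_1d(p, q):
    # Exact EMD for 1-D histograms with equal total mass and unit spacing
    # between adjacent bins: the L1 distance between the running sums.
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def haar_embed(h):
    # Weighted Haar-style detail coefficients (assumed stand-in embedding).
    # len(h) must be a power of two. Coarser scales get larger weights
    # because mass moved between coarse blocks travels farther.
    coeffs, level, size = [], list(h), 1
    while len(level) > 1:
        pairs = list(zip(level[0::2], level[1::2]))
        coeffs.extend(size * (a - b) for a, b in pairs)
        level = [a + b for a, b in pairs]
        size *= 2
    return coeffs


def l1(u, v):
    # L1 distance between two embedded vectors.
    return sum(abs(a - b) for a, b in zip(u, v))


def filter_and_refine(query, database, shortlist_size=2):
    # Filter: rank all entries by the cheap embedded L1 distance.
    # (A linear scan here; the abstract uses LSH for sublinear lookup.)
    q_emb = haar_embed(query)
    ranked = sorted(range(len(database)),
                    key=lambda i: l1(q_emb, haar_embed(database[i])))
    shortlist = ranked[:shortlist_size]
    # Refine: re-rank only the short list with the exact EMD.
    return min(shortlist, key=lambda i: emd_1d(query, database[i]))


database = [[4, 0, 0, 0], [0, 4, 0, 0], [0, 0, 0, 4], [1, 1, 1, 1]]
best = filter_and_refine([3, 1, 0, 0], database)
```

In this toy setup the exact EMD is evaluated on only two of the four database entries; the saving reported in the abstract comes from the same principle applied to a much larger database with a sublinear filter step.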
Keywords :
document image processing; filtering theory; image matching; image retrieval; shape recognition; text analysis; wavelet transforms; APTI database; Arabic Printed Text Image database; Arabic word recognition; Arabic word searching; EMD algorithm; LSH; MNIST dataset; angular radial partitioning descriptors; approximate k-nearest neighbors; coefficients histograms; earth movers distance algorithm; filter strategy; locality-sensitive hashing; matching evaluation; refine strategy; shape context descriptors; shape similarity measurement; wavelet coefficients domain; word image retrieval; Context; Databases; Earth; Handwriting recognition; Histograms; Measurement; Shape; Arabic Text Recognition; Earth Movers Distance; Image retrieval; Shape Context; Word matching;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
ISSN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2013.70
Filename :
6628635