DocumentCode
183205
Title
A Simple and Fast Word Spotting Method
Author
Kovalchuk, Alon ; Wolf, Lior ; Dershowitz, Nachum
Author_Institution
Blavatnik Sch. of Comput. Sci., Tel Aviv Univ., Tel Aviv, Israel
fYear
2014
fDate
1-4 Sept. 2014
Firstpage
3
Lastpage
8
Abstract
A simple and efficient pipeline for word spotting in handwritten documents is proposed. The method allows for extremely rapid querying, while still maintaining high accuracy. The dataset images that are to be queried are preprocessed by a simple binarization operation, followed by the extraction of multiple overlapping candidate targets. Each binary target, as well as the binarized query, is resized to fit a fixed-size rectangle and represented by conventional image descriptors. Then, a cosine similarity operator -- followed by maximum pooling over random groups -- is used to represent each target or query as a concise 250D vector. Retrieval is performed in a fraction of a second by nearest-neighbor search within that space, followed by a simple suppression of extra overlapping candidates.
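The encoding and retrieval steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say what the cosine similarity is computed against, so the sketch assumes a hypothetical matrix of reference descriptors (`refs`). It also assumes 250 random groups of reference indices, Euclidean distance for the nearest-neighbor search, and an intersection-over-union threshold (`max_iou`) for suppressing overlapping candidates. All function and parameter names are illustrative only.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def make_groups(n_refs, n_groups, rng):
    """Randomly partition reference indices into n_groups groups (assumed scheme).

    Requires n_refs >= n_groups so that no group is empty.
    """
    return np.array_split(rng.permutation(n_refs), n_groups)


def encode(descriptors, refs, groups):
    """Cosine similarity to every reference, then max pooling within each group.

    descriptors: (n, d) array, refs: (n_refs, d) array.
    Returns an (n, len(groups)) array, e.g. 250 columns for 250 groups.
    """
    sims = l2_normalize(descriptors) @ l2_normalize(refs).T
    return np.stack([sims[:, g].max(axis=1) for g in groups], axis=1)


def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def query(q_vec, target_vecs, boxes, top_k=5, max_iou=0.5):
    """Rank targets by distance to the query, skipping candidates that
    overlap an already accepted, better-ranked candidate."""
    dists = np.linalg.norm(target_vecs - q_vec, axis=1)
    kept = []
    for i in np.argsort(dists):
        if all(iou(boxes[i], boxes[j]) <= max_iou for j in kept):
            kept.append(int(i))
        if len(kept) == top_k:
            break
    return kept
```

In this sketch the candidate targets would be encoded once, offline, and each query would need only one `encode` call and one pass over the stored vectors, which is in keeping with the fast retrieval the abstract describes.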
Keywords
document image processing; handwritten character recognition; image representation; image retrieval; vectors; 250D vector; binarization operation; conventional image descriptor representation; cosine similarity operator; dataset image querying; handwritten documents; nearest-neighbor search; word spotting method; Accuracy; Benchmark testing; Image segmentation; Pipelines; Standards; Training; Vectors
fLanguage
English
Publisher
IEEE
Conference_Titel
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location
Heraklion
ISSN
2167-6445
Print_ISBN
978-1-4799-4335-7
Type
conf
DOI
10.1109/ICFHR.2014.9
Filename
6980988