Title :
Query by string word spotting based on character bi-gram indexing
Author :
Suman K. Ghosh;Ernest Valveny
Author_Institution :
Computer Vision Center, Dept. Ci`encies de la Computació
Abstract :
In this paper we propose a segmentation-free query by string word spotting method. Both the documents and query strings are encoded using a recently proposed word representation that projects images and strings into a common attribute space based on a Pyramidal Histogram of Characters (PHOC). These attribute models are learned using linear SVMs over the Fisher Vector [8] representation of the images along with the PHOC labels of the corresponding strings. In order to search through the whole page, document regions are indexed per character bi-gram using a similar attribute representation. On top of that, we propose an integral image representation of the document using a simplified version of the attribute model for efficient computation. Finally we introduce a re-ranking step in order to boost retrieval performance. We show state-of-the-art results for segmentation-free query by string word spotting in single-writer and multi-writer standard datasets.
Keywords :
Image segmentation
Conference_Titel :
Document Analysis and Recognition (ICDAR), 2015 13th International Conference on
DOI :
10.1109/ICDAR.2015.7333888