Title :
Document Specific Sparse Coding for Word Retrieval
Author :
Shekhar, Ravi ; Jawahar, C.V.
Author_Institution :
Centre for Visual Inf. Technol., Int. Inst. of Inf. Technol., Hyderabad, India
Abstract :
Bag of words (BoW) based retrieval is an efficient method to compare the visual similarity between two images. Recognition free methods based on BoW have been shown to outperform OCR based methods. We further improve the performance by defining a document specific sparse coding scheme for representing visual words (interest points) in document images. Our method is motivated by the successful use of sparsity in signal representation by exploiting neighbourhood properties. In addition to providing insights into the design of the coding scheme, we also verify the method on two data sets and compare it with recent methods. We have also developed a text query based search solution, and we report performance comparable to image based search.
Keywords :
document image processing; image coding; image representation; information retrieval; BoW based retrieval; bag of words based retrieval; document images; document specific sparse coding; recognition free methods; signal representation; text query based search solution; visual similarity; Encoding; Feature extraction; Image coding; Quantization (signal); Vectors; Visualization; Vocabulary; Bag of Words; Document Image Retrieval; Sparse Coding;
Conference_Title :
2013 12th International Conference on Document Analysis and Recognition (ICDAR)
Conference_Location :
Washington, DC
DOI :
10.1109/ICDAR.2013.132