DocumentCode :
178436
Title :
Word Spotting in Bangla and English Graphical Documents
Author :
Tarafdar, A. ; Pal, U. ; Ramel, J.-Y. ; Ragot, N. ; Chaudhuri, B.B.
Author_Institution :
CVPR Unit, Indian Stat. Inst., Kolkata, India
fYear :
2014
fDate :
24-28 Aug. 2014
Firstpage :
3044
Lastpage :
3049
Abstract :
Word spotting in graphical documents is a very challenging task. With the increasing use of electronic media, there is a need to search for objects in graphical documents by their text labels. To address such scenarios, we propose a word spotting system dedicated to graphical documents with Bangla and English scripts. In the proposed system, the text and graphics layers are first separated using a Gabor filter. In the text layer, a water reservoir based character segmentation approach is applied to extract each character from the document. These isolated characters are then recognized using rotation invariant features coupled with an SVM classifier. Well-recognized characters are then grouped by size, and initial spotting searches for the query word among those groups of characters. If the system spots a word only partially because of noise, SIFT is applied to identify the missing portion of that partial spotting. Experimental results on English and Bangla script document images show that the method can feasibly spot locations in text-labeled graphical documents.
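The size-grouping and initial-spotting steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no algorithmic detail, so the `Char` record, the `group_by_size` and `spot` helpers, the 20% height tolerance, and the left-to-right reading order are all hypothetical, and a longest-common-substring match stands in for whatever matching the paper actually uses. Gabor filtering, water reservoir segmentation, SVM recognition, and SIFT recovery are not covered here.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Char:
    """One recognized character (hypothetical record, not from the paper)."""
    label: str     # label assigned by the character classifier
    x: float       # horizontal position of the character
    height: float  # character height in pixels


def group_by_size(chars, tolerance=0.2):
    """Group characters whose height is within `tolerance` (relative)
    of the smallest character in the group. The threshold is an assumption."""
    groups = []
    for ch in sorted(chars, key=lambda c: c.height):
        if groups and ch.height <= groups[-1][0].height * (1 + tolerance):
            groups[-1].append(ch)
        else:
            groups.append([ch])
    return groups


def spot(query, group):
    """Read a group left to right and return (text, matched fraction).

    The fraction is the longest run of the query found in the group,
    divided by the query length: 1.0 is a full spot, anything between
    0 and 1 is a partial spot that a later stage would have to complete.
    """
    text = "".join(c.label for c in sorted(group, key=lambda c: c.x))
    matcher = SequenceMatcher(None, query, text, autojunk=False)
    match = matcher.find_longest_match(0, len(query), 0, len(text))
    return text, match.size / len(query)


def spot_in_document(query, chars):
    """Run the initial spotting over every size group."""
    return [spot(query, group) for group in group_by_size(chars)]
```

Under these assumptions, a group whose fraction is below 1.0 but above 0 corresponds to the partial spotting case in the abstract, where the missing portion still has to be located by a separate step.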
Keywords :
Gabor filters; document image processing; image classification; image retrieval; image segmentation; natural language processing; optical character recognition; support vector machines; text analysis; transforms; Bangla graphical documents; Bangla scripts; English graphical documents; English scripts; Gabor filter; SIFT; SVM classifier; character segmentation approach; electronic media usage; isolated character recognition; object searching; query word; rotation invariant feature; scale invariant feature transform; support vector machine; text labeled graphical documents; text-graphics layers; word spotting system; Character recognition; Feature extraction; Gabor filters; Graphics; Reservoirs; Support vector machines; Clustering; Document Image Analysis; Gabor Filter; Graphical documents; Information Retrieval; SIFT feature; Water Reservoir Principle; Word Spotting;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition (ICPR), 2014 22nd International Conference on
Conference_Location :
Stockholm
ISSN :
1051-4651
Type :
conf
DOI :
10.1109/ICPR.2014.525
Filename :
6977237