DocumentCode
3695231
Title
A segmentation-free approach for printed Devanagari script recognition
Author
Tushar Karayil;Adnan Ul-Hasan;Thomas M. Breuel
Author_Institution
Department of Computer Science, University of Kaiserslautern, Germany
fYear
2015
Firstpage
946
Lastpage
950
Abstract
Long Short-Term Memory (LSTM) networks are well suited to segmentation-free Optical Character Recognition (OCR) tasks because of their context-aware processing. In this paper, we report the results of applying LSTM networks to Devanagari script, where consonant-consonant conjuncts and consonant-vowel combinations take different forms depending on their position in the word. We also introduce a new, freely available database of Devanagari script, Deva-DB, to support research toward a robust Devanagari OCR system. On this database, the LSTM-based OCRopus system yields error rates ranging from 1.2% to 9.0%, depending on the complexity of the training and test data. A comparison with the open-source Tesseract system on the same database is also presented.
Keywords
"Optical imaging","Robustness","Proteins","Periodic structures","Computer aided software engineering","Adaptive optics","Integrated optics"
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition (ICDAR), 2015 13th International Conference on
Type
conf
DOI
10.1109/ICDAR.2015.7333901
Filename
7333901