• DocumentCode
    3695231
  • Title
    A segmentation-free approach for printed Devanagari script recognition
  • Author
    Tushar Karayil;Adnan Ul-Hasan;Thomas M. Breuel
  • Author_Institution
    Department of Computer Science, University of Kaiserslautern, Germany
  • fYear
    2015
  • Firstpage
    946
  • Lastpage
    950
  • Abstract
    Long Short-Term Memory (LSTM) networks are suitable candidates for segmentation-free Optical Character Recognition (OCR) tasks due to their good context-aware processing. In this paper, we report the results of applying LSTM networks to Devanagari script, where consonant-consonant conjuncts and consonant-vowel combinations take different forms based on their position in the word. We also introduce a new database, Deva-DB, of Devanagari script (free of cost) to aid research towards a robust Devanagari OCR system. On this database, the LSTM-based OCRopus system yields error rates ranging from 1.2% to 9.0%, depending upon the complexity of the training and test data. A comparison with the open-source Tesseract system is also presented for the same database.
  • Keywords
    "Optical imaging","Robustness","Proteins","Periodic structures","Computer aided software engineering","Adaptive optics","Integrated optics"
  • Publisher
    ieee
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2015 13th International Conference on
  • Type
    conf
  • DOI
    10.1109/ICDAR.2015.7333901
  • Filename
    7333901