• DocumentCode
    2825587
  • Title

    Tools for Developing OCRs for Indian Scripts

  • Author

    Kumar, M N S S K Pavan ; Kiran, S S Ravi ; Nayani, Abhishek ; Jawahar, C.V. ; Narayanan, P.J.

  • Author_Institution
    International Institute of Information Technology
  • Volume
    3
  • fYear
    2003
  • fDate
    16-22 June 2003
  • Firstpage
    33
  • Lastpage
    33
  • Abstract
    Development of OCRs for Indian scripts is an active area of research today. Indian scripts present great challenges to an OCR designer due to the large number of letters in the alphabet, the sophisticated ways in which they combine, and the complicated graphemes that result. The problem is compounded by the unstructured manner in which popular fonts are designed. There is a lot of common structure in the different Indian scripts. In this paper, we argue that a number of automatic and semi-automatic tools can ease the development of recognizers for new font styles and new scripts. We briefly discuss three such tools that we developed and show how they have helped build new OCRs. An integrated approach to the design of OCRs for all Indian scripts has great benefits. We are building OCRs for many Indian languages following this approach as part of a system to provide tools to create content in them.
  • Keywords
    Buildings; Code standards; Electronics industry; Industrial electronics; Information technology; Optical character recognition software; Optical materials; Publishing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition Workshop, 2003. CVPRW '03. Conference on
  • Conference_Location
    Madison, Wisconsin, USA
  • ISSN
    1063-6919
  • Print_ISBN
    0-7695-1900-8
  • Type

    conf

  • DOI
    10.1109/CVPRW.2003.10023
  • Filename
    4624291