• DocumentCode
    2319856
  • Title
    An extended method for recognition of broken typewritten characters special reference to Tamil script
  • Author
    Abubacker, Nirase Fathima ; Gandhi, Raman Indra
  • Author_Institution
    Sch. of Inf. Technol., City Univ. Coll. of Sci. & Technol., Kuala Lumpur, Malaysia
  • fYear
    2011
  • fDate
    25-28 Sept. 2011
  • Firstpage
    214
  • Lastpage
    219
  • Abstract
    Preparing clean and clear images for recognition engines is often taken for granted as a trivial task that requires little attention. Most existing OCR systems are designed to correctly identify finely printed documents in all scripts. A standard machine-printed OCR system, however, fails when it is tested on documents with distorted characters. This paper presents an approach to overcome the difficulties posed by such distorted typewritten documents, especially those with broken characters. As a first step, each character is isolated using character position location and character localization and is enclosed in a matrix, which is analyzed and repaired in the later part of the study. Distorted broken characters are then recognized using a shape and line tracing method, and the result is fine-tuned with lexical knowledge.
  • Keywords
    document image processing; natural language processing; optical character recognition; OCR; broken typewritten characters special reference; character localization; character position location; distorted characters; extended method; lexical knowledge; printed documents; tamil script; Accuracy; Character recognition; Conferences; Feature extraction; Open systems; Shape; Support vector machine classification; Broken Tamil; Distorted Characters; Line Tracing; Localization; Shape Tracing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Open Systems (ICOS), 2011 IEEE Conference on
  • Conference_Location
    Langkawi
  • Print_ISBN
    978-1-61284-931-7
  • Type
    conf
  • DOI
    10.1109/ICOS.2011.6079265
  • Filename
    6079265