• DocumentCode
    3142690
  • Title

    Automatic separation of machine-printed and hand-written text lines

  • Author

    Pal, U. ; Chaudhuri, B.B.

  • Author_Institution
    Comput. Vision & Pattern Recognition Unit, Indian Stat. Inst., Calcutta, India
  • fYear
    1999
  • fDate
    20-22 Sep 1999
  • Firstpage
    645
  • Lastpage
    648
  • Abstract
    There are many types of documents where machine-printed and hand-written texts appear intermixed. Since the optical character recognition (OCR) methodologies for machine-printed and hand-written texts are different, it is necessary to separate these two types of text before feeding them to the respective OCR systems. In this paper, we present such a scheme for both Bangla and Devnagari characters. The scheme is based on the structural and statistical features of the machine-printed and hand-written text lines. The classification scheme has an accuracy of about 98.3%.
  • Keywords
    character sets; document image processing; image classification; image segmentation; optical character recognition; Bangla characters; Devnagari characters; OCR systems; accuracy; automatic text line separation; classification scheme; handwritten text lines; machine-printed text lines; statistical features; structural features; Computer vision; Data mining; Handwriting recognition; Histograms; Image segmentation; Natural languages; Neural networks; Optical character recognition software; Pattern recognition; Statistics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
  • Conference_Location
    Bangalore
  • Print_ISBN
    0-7695-0318-7
  • Type

    conf

  • DOI
    10.1109/ICDAR.1999.791870
  • Filename
    791870