• DocumentCode
    2763528
  • Title
    Segmentation of low-quality typewritten digits
  • Author
    Rodriguez, C.; Muguerza, J.; Navarro, M.; Zarate, A.; Martin, J.I.; Perez, J.M.
  • Author_Institution
    Comput. Archit. & Technol. Dept., Basque Country Univ., Donostia, Spain
  • Volume
    2
  • fYear
    1998
  • fDate
    16-20 Aug 1998
  • Firstpage
    1106
  • Abstract
    This work addresses the segmentation of numeric fields in forms whose digits are blurred, broken or touching. In an OCR system, the segmentation phase plays a decisive role in the overall accuracy of the system. Segmentation is usually approached in one of two ways: (a) as an isolated phase of the OCR process, or (b) as a phase that interacts with the recognition of the segmented item. In this work, we take the first approach and develop a robust new cost function that combines vertical projection, the Tsujimoto metric (1991) and background information. Unlike other techniques reported in the literature, ours obtains a near-optimum number of break points in fields containing broken, blurred and touching characters, leading to high accuracy in the overall OCR system. Our experiments on a sample of about 11283 numeric fields in 144 forms (more than 50000 digits of that kind) show that 99.74% of the fields were correctly segmented. The new cost function only made 50 errors
  • Keywords
    image segmentation; optical character recognition; OCR system; Tsujimoto metric; background information; blurring; low-quality typewritten digit segmentation; vertical projection; Character recognition; Computer architecture; Cost function; Electronic mail; Feature extraction; Image quality; Image segmentation; Optical character recognition software; Read only memory; Text recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 1998. Proceedings. Fourteenth International Conference on
  • Conference_Location
    Brisbane, Qld.
  • ISSN
    1051-4651
  • Print_ISBN
    0-8186-8512-3
  • Type
    conf
  • DOI
    10.1109/ICPR.1998.711887
  • Filename
    711887