  • DocumentCode
    2146752
  • Title
    Image Enhancement for Degraded Binary Document Images
  • Author
    Shi, Zhixin; Setlur, Srirangaraj; Govindaraju, Venu
  • Author_Institution
    Dept. of Comput. Sci. & Eng., State Univ. of New York at Buffalo, Buffalo, NY, USA
  • fYear
    2011
  • fDate
    18-21 Sept. 2011
  • Firstpage
    895
  • Lastpage
    899
  • Abstract
    This paper presents a novel set of image enhancement algorithms for binary images of poorly scanned real-world page documents. The methods target large blobs or clutter noise, salt-and-pepper noise, and the detection and removal of non-text objects such as form lines and rule-lines. The algorithms are shown to be very effective at removing clutter noise and pepper noise as well as form lines and rule-lines. A region growing algorithm is also described that enhances the quality of the text and repairs the damage caused by salt noise, which leaves holes in the text and creates broken strokes. The methods were tested on 204 images from the challenge set of the DARPA MADCAT Arabic handwritten document image data. The results indicate that the methods are robust and capable of significantly improving image quality for downstream OCR systems.
  • Keywords
    document image processing; handwritten character recognition; image denoising; image enhancement; optical character recognition; DARPA MADCAT Arabic handwritten document image data; binary images; clutter noise removal; degraded binary document images; downstream OCR systems; image enhancement; image quality; nontext object detection; nontext object removal; pepper noise removal; salt noise; salt-and-pepper noise; text quality enhancement; Algorithm design and analysis; Clutter; Image edge detection; Image enhancement; Noise; Optical character recognition software; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2011 International Conference on
  • Conference_Location
    Beijing
  • ISSN
    1520-5363
  • Print_ISBN
    978-1-4577-1350-7
  • Electronic_ISBN
    1520-5363
  • Type
    conf
  • DOI
    10.1109/ICDAR.2011.305
  • Filename
    6065440