• DocumentCode
    2060440
  • Title

    Adaptive document binarization

  • Author

    Sauvola, Jaakko ; Seppänen, Tapio ; Haapakoski, Sami ; Pietikäinen, Matti

  • Author_Institution
    Machine Vision & Media Process. Group, Oulu Univ., Finland
  • Volume
    1
  • fYear
    1997
  • fDate
    18-20 Aug 1997
  • Firstpage
    147
  • Abstract
    A new method is presented for adaptive document image binarization, where the page is considered as a collection of subcomponents such as text, background and picture. The problems caused by noise, illumination and many source type related degradations are addressed. The algorithm uses document characteristics to determine (surface) attributes, often used in document segmentation. Using characteristic analysis, two new algorithms are applied to determine a local threshold for each pixel. An algorithm based on soft decision control is used for thresholding the background and picture regions. An approach utilizing local mean and variance of gray values is applied to textual regions. Tests were performed with images including different types of document components and degradations. The results show that the method adapts and performs well in each case.
  • Keywords
    document image processing; image segmentation; lighting; noise; optical character recognition; performance evaluation; adaptive document image binarization; background; characteristic analysis; document characteristics; document segmentation; gray values; illumination; image degradations; image thresholding; local mean; local threshold; page; performance; picture; picture regions; pixel; soft decision control; text; textual regions; variance; Algorithm design and analysis; Degradation; Histograms; Image analysis; Image segmentation; Lighting; Machine vision; Pixel; Testing; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on
  • Conference_Location
    Ulm
  • Print_ISBN
    0-8186-7898-4
  • Type

    conf

  • DOI
    10.1109/ICDAR.1997.619831
  • Filename
    619831
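
The abstract mentions a per-pixel local threshold computed from the local mean and variance of gray values for textual regions, but this record gives no formula. The sketch below is only an illustration under stated assumptions: it uses the threshold rule T = m * (1 + k * (s / r - 1)) that is commonly attributed to this family of methods, and the function name `local_threshold_binarize`, the window size, and the defaults for `k` and `r` are hypothetical choices, not values taken from the record. It does not cover the soft decision control used for background and picture regions.

```python
from math import sqrt


def local_threshold_binarize(gray, window=3, k=0.5, r=128.0):
    """Binarize a 2-D list of gray values (0..255) with a per-pixel threshold.

    Assumed rule (not stated in the record): T = m * (1 + k * (s / r - 1)),
    where m and s are the mean and standard deviation inside the window.
    Returns a 2-D list with 0 for dark (text) pixels and 1 for background.
    """
    height, width = len(gray), len(gray[0])
    half = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood, clipped at the image borders.
            values = [
                gray[j][i]
                for j in range(max(0, y - half), min(height, y + half + 1))
                for i in range(max(0, x - half), min(width, x + half + 1))
            ]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            threshold = mean * (1.0 + k * (sqrt(variance) / r - 1.0))
            row.append(1 if gray[y][x] > threshold else 0)
        result.append(row)
    return result


# A bright page with a dark vertical stroke in the middle column.
page = [
    [200, 200, 200, 200, 200],
    [200, 200, 40, 200, 200],
    [200, 200, 40, 200, 200],
    [200, 200, 40, 200, 200],
    [200, 200, 200, 200, 200],
]
binary = local_threshold_binarize(page)
for line in binary:
    print(line)
```

In flat regions the standard deviation is near zero, so the threshold drops well below the local mean and bright background stays white; near a stroke the variance rises and the threshold moves up toward the mean, which separates the dark pixels. A practical implementation would normally compute the window statistics with integral images instead of the nested loops used here.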