  • DocumentCode
    2011626
  • Title
    Effect of "Ground Truth" on Image Binarization
  • Author
    Barney Smith, Elisa H. ; An, Chang
  • Author_Institution
    Boise State Univ., Boise, ID, USA
  • fYear
    2012
  • fDate
    27-29 March 2012
  • Firstpage
    250
  • Lastpage
    254
  • Abstract
    Image binarization has a large effect on the rest of the document image analysis processes in character recognition. Algorithm development is still a major focus of research. Image binarization has been evaluated by comparing the results of OCR systems on images binarized by different methods. That approach has been criticized because it does not evaluate the binarization alone, but rather how it interacts with the downstream processes. Recently, pixel-accurate "ground truth" images have been introduced for use in binarization algorithm evaluation. Such ground truth has been shown to be open to interpretation. The choice of binarization ground truth affects the binarization algorithm design, either directly if design is by an automated algorithm trying to match the provided ground truth, or indirectly if human designers adjust their designs to perform better on the provided data. Three variations in pixel-accurate ground truth were used to train a binarization classifier. The performance can vary significantly depending on the choice of ground truth, which can influence binarization design choices.
  • Keywords
    document image processing; image classification; optical character recognition; OCR system; algorithm development; automated algorithm; binarization algorithm evaluation; binarization classifier; binarization design choices; character recognition; document image analysis; downstream process; human designer; image binarization; optical character recognition; performance evaluation; pixel accurate ground truth images; Algorithm design and analysis; Humans; Image segmentation; Measurement; Optical character recognition software; Text analysis; Training; Degraded document images; Ground Truthing; Image Binarization; Performance Evaluation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on
  • Conference_Location
    Gold Coast, QLD
  • Print_ISBN
    978-1-4673-0868-7
  • Type
    conf
  • DOI
    10.1109/DAS.2012.32
  • Filename
    6195373