DocumentCode
2011626
Title
Effect of "Ground Truth" on Image Binarization
Author
Smith, Elisa H Barney ; An, Chang
Author_Institution
Boise State Univ., Boise, ID, USA
fYear
2012
fDate
27-29 March 2012
Firstpage
250
Lastpage
254
Abstract
Image binarization has a large effect on the rest of the document image analysis processes in character recognition. Algorithm development is still a major focus of research. Evaluation of image binarization has been done by comparing the results of OCR systems on images binarized by different methods. That approach has been criticized because it evaluates not the binarization alone but how it interacts with the downstream processes. Recently, pixel-accurate "ground truth" images have been introduced for use in binarization algorithm evaluation. Such ground truth has been shown to be open to interpretation. The choice of binarization ground truth affects the binarization algorithm design, either directly, if the design is produced by an automated algorithm trying to match the provided ground truth, or indirectly, if human designers adjust their designs to perform better on the provided data. Three variations of pixel-accurate ground truth were used to train a binarization classifier. Performance can vary significantly depending on the choice of ground truth, which can influence binarization design choices.
Keywords
document image processing; image classification; optical character recognition; OCR system; algorithm development; automated algorithm; binarization algorithm evaluation; binarization classifier; binarization design choices; character recognition; document image analysis; downstream process; human designer; image binarization; optical character recognition; performance evaluation; pixel accurate ground truth images; Algorithm design and analysis; Humans; Image segmentation; Measurement; Optical character recognition software; Text analysis; Training; Degraded document images; Ground Truthing; Image Binarization; Performance Evaluation;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on
Conference_Location
Gold Coast, QLD
Print_ISBN
978-1-4673-0868-7
Type
conf
DOI
10.1109/DAS.2012.32
Filename
6195373