Title :
OCR voting methods for recognizing low contrast printed documents
Author_Institution :
ScanSoft-Recognita, Inc., Burlington, MA
Abstract :
Modern adaptive thresholding algorithms do their best to provide good-quality binarized images. Unfortunately, when the original grey image has low contrast, it is hard to find a good compromise between the amount of background noise in the binary result and the number of breaks or missing parts in the character shapes. In this paper, we describe several voting methods, ranging from an external "black box" voter to a more deeply integrated "shape" voter, that can be used to generate even better recognition results by running a voting OCR engine on two differently thresholded images.
Keywords :
document image processing; optical character recognition; OCR voting methods; adaptive thresholding algorithms; background noise; external black box voter; good quality binarized images; integrated shape voter; low contrast grey image; low contrast printed document recognition; thresholded images; voting OCR engine; Background noise; Character recognition; Engines; Filtering algorithms; Image recognition; Noise shaping; Optical character recognition software; Shape; Text recognition; Voting;
Conference_Title :
Document Image Analysis for Libraries, 2006. DIAL '06. Second International Conference on
Conference_Location :
Lyon
Print_ISBN :
0-7695-2531-8
DOI :
10.1109/DIAL.2006.28