Title :
Modeling image degradations for improving OCR
Author :
Barney Smith, Elisa H.
Author_Institution :
Electr. & Comput. Eng., Boise State Univ., Boise, ID, USA
Abstract :
Clean documents are relatively easy to recognize. However, when collections of documents are digitized, clean documents are rarely what is encountered. The processes of printing and scanning introduce image degradations that interfere with segmentation and recognition. Mathematical models of the degradation processes are presented. From these models, the observed degradations can be described both quantitatively and qualitatively. The discussion covers sampling, edge spread, corner erosion, and edge noise. The relationship between these degradations and common OCR errors is described. The degradation model provides a theoretical foundation for improving the document recognition process.
Keywords :
edge detection; image denoising; image sampling; optical character recognition; OCR; corner erosion; document recognition; edge noise; edge spread; image degradations; image recognition; image segmentation; Additive noise; Degradation; Europe; Image edge detection; Optical character recognition software
Conference_Title :
2008 16th European Signal Processing Conference
Conference_Location :
Lausanne