DocumentCode
3023181
Title
Algorithms for postprocessing OCR results with visual inter-word constraints
Author
Hong, Tao ; Hull, Jonathan J.
Author_Institution
Center of Excellence for Document Analysis and Recognition, State Univ. of New York, Buffalo, NY, USA
Volume
3
fYear
1995
fDate
23-26 Oct 1995
Firstpage
312
Abstract
Algorithms are presented that determine the visual relationships between word images in a document. These include instances of common word images and common substrings that occur often in English language text images. This information is then used to improve the performance of a commercial optical character recognition (OCR) algorithm. The algorithms presented calculate clusters of equivalent word images as well as common initial and final substrings. Experimental results are presented that show a 40% reduction in word-level error rate on a test set of documents degraded by uniform noise.
Keywords
document image processing; optical character recognition; English language text images; OCR algorithm; OCR results; experimental results; optical character recognition algorithm; performance; postprocessing algorithms; substrings; text document; uniform noise; visual interword constraints; word images; word level error rate; Character recognition; Clustering algorithms; Degradation; Error analysis; Natural languages; Noise level; Noise reduction; Optical character recognition software; Optical noise; Testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Image Processing, 1995. Proceedings., International Conference on
Conference_Location
Washington, DC
Print_ISBN
0-8186-7310-9
Type
conf
DOI
10.1109/ICIP.1995.537638
Filename
537638