Title :
Visual inter-word relations and their use in OCR postprocessing
Author :
Hong, Tao ; Hull, Jonathan J.
Author_Institution :
CEDAR, State Univ. of New York, Buffalo, NY, USA
Abstract :
A technique is presented that uses visual relationships between word images in a document to improve the recognition of the text it contains. These relationships are usually lost in conventional optical character recognition (OCR) techniques. The visual relations are defined as the equivalence that exists between images of the same word or portions of word images. An algorithm is presented that calculates these relationships in a document and groups equivalent images into clusters. The resulting clusters are integrated with the recognition results provided by an OCR system. Inconsistencies in OCR results between equivalent images are identified and used to improve recognition performance. Experimental results are presented in which the input is provided directly from a commercial OCR system.
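The postprocessing idea in the abstract can be illustrated with a minimal sketch. It assumes the image clustering has already been done and is given as lists of word positions. The abstract does not say how inconsistencies are resolved, so the majority vote used here is an assumption, and the function name `correct_by_cluster_vote` is hypothetical.

```python
from collections import Counter


def correct_by_cluster_vote(ocr_words, clusters):
    """Flag clusters whose OCR labels disagree and fix them by majority vote.

    ocr_words: list of OCR output strings, one per word image.
    clusters:  list of lists of indices into ocr_words; each inner list
               holds word images assumed to be visually equivalent.
    Majority voting is an illustrative assumption, not the paper's method.
    """
    corrected = list(ocr_words)
    inconsistent = []
    for cluster in clusters:
        labels = Counter(ocr_words[i] for i in cluster)
        if len(labels) <= 1:
            continue  # all images in the cluster were read the same way
        inconsistent.append(cluster)
        (top, top_n), (_, second_n) = labels.most_common(2)
        if top_n > second_n:
            # A strict majority exists: apply it to the whole cluster.
            for i in cluster:
                corrected[i] = top
        # On a tie the cluster is only flagged, not changed.
    return corrected, inconsistent


ocr_words = ["the", "tbe", "the", "cat", "cat", "dog", "clog"]
clusters = [[0, 1, 2], [3, 4], [5, 6]]
corrected, inconsistent = correct_by_cluster_vote(ocr_words, clusters)
```

In this sketch a tied cluster is reported but left unchanged, since a vote alone gives no basis for choosing between the two readings.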
Keywords :
document image processing; optical character recognition; OCR postprocessing; character recognition; document; equivalent images; inter-word relations; recognition performance; word images; Character recognition; Clustering algorithms; Digital images; Image analysis; Image recognition; Image segmentation; Marine vehicles; Optical character recognition software; Text analysis; Text recognition
Conference_Title :
Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
0-8186-7128-9
DOI :
10.1109/ICDAR.1995.599031