DocumentCode :
3141458
Title :
MergeLayouts - overcoming faulty segmentations by a comprehensive voting of commercial OCR devices
Author :
Klink, Stefan ; Jäger, Thorsten
Author_Institution :
German Res. Center for Artificial Intelligence, Kaiserslautern, Germany
fYear :
1999
fDate :
20-22 Sep 1999
Firstpage :
386
Lastpage :
389
Abstract :
In this paper we present a comprehensive voting approach that takes entire layouts obtained from commercial OCR devices as input. Such a layout comprises segments of three kinds: lines, words, and characters. By combining all attributes of a segment (e.g. recognized text, font height, etc.), we attain a “better” layout that represents the original page layout as faithfully as possible. The voting process itself is hierarchically organized, starting with the line segments. For each level, a search tree is spawned and all fellow segments (segments from different layouts which denote the same image area) are established. A heuristic search method is utilized which is guided by a similarity measure defined on segments. Deviations in the segmentation, as well as segmentation errors of individual commercial OCR devices, are compensated for by an “equalization module”.
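The idea of fellow segments and voting can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify the similarity measure, the search tree, or the equalization module, so this sketch assumes bounding-box intersection-over-union as the similarity, a greedy best match per layout in place of the heuristic search, and a simple majority vote on the recognized text only. The names `Segment`, `overlap_similarity`, `find_fellows`, `vote_text`, and the threshold of 0.5 are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A layout segment: bounding box plus recognized text (hypothetical type)."""
    x0: int
    y0: int
    x1: int
    y1: int
    text: str


def overlap_similarity(a: Segment, b: Segment) -> float:
    """Assumed similarity: intersection-over-union of the two boxes, in [0, 1]."""
    iw = min(a.x1, b.x1) - max(a.x0, b.x0)
    ih = min(a.y1, b.y1) - max(a.y0, b.y0)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a.x1 - a.x0) * (a.y1 - a.y0)
    area_b = (b.x1 - b.x0) * (b.y1 - b.y0)
    return inter / (area_a + area_b - inter)


def find_fellows(reference, layouts, threshold=0.5):
    """Greedy stand-in for the search: per layout, keep the best-matching
    segment if it covers roughly the same image area as the reference."""
    fellows = []
    for layout in layouts:
        best = max(layout, key=lambda s: overlap_similarity(reference, s),
                   default=None)
        if best is not None and overlap_similarity(reference, best) >= threshold:
            fellows.append(best)
    return fellows


def vote_text(segments):
    """Majority vote over the recognized text of the given segments."""
    return Counter(s.text for s in segments).most_common(1)[0][0]


# Three made-up OCR layouts of the same page (line level only).
layout_a = [Segment(10, 10, 110, 30, "Hello"), Segment(10, 40, 110, 60, "world")]
layout_b = [Segment(12, 11, 108, 31, "Hcllo"), Segment(11, 41, 109, 61, "world")]
layout_c = [Segment(9, 10, 111, 29, "Hello")]

reference = layout_a[0]
fellows = find_fellows(reference, [layout_b, layout_c])
merged_text = vote_text([reference] + fellows)
```

Here the first line of `layout_a` finds one fellow in each of the other two layouts, and the vote over the three readings settles on the text that two of them agree on. A fuller treatment would repeat this at the word and character levels and vote on other attributes such as font height.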
Keywords :
image classification; image segmentation; merging; optical character recognition; MergeLayouts; characters; commercial OCR devices; comprehensive voting; equalization module; faulty segmentations; heuristic search method; line segments; lines; page layout; search tree; similarity measure; words; Artificial intelligence; Character recognition; Fellows; Image segmentation; Optical character recognition software; Path planning; Search problems; Voting;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location :
Bangalore
Print_ISBN :
0-7695-0318-7
Type :
conf
DOI :
10.1109/ICDAR.1999.791805
Filename :
791805