Title :
A robust method for unknown forms analysis
Author :
Xingyuan, Li ; Doermann, David ; Oh, Weon-Geun ; Gao, Wen
Author_Institution :
Dept. of Comput. Sci. & Eng., Harbin Inst. of Technol., China
Abstract :
This paper proposes a strategy for analyzing unknown, filled forms. First, horizontal and vertical line segments are detected, extracted, and filtered. A recursive splitting and merging algorithm eliminates overlapping segments, filters false segments, and groups the remaining segments into lines. Based on the extracted lines, an algorithm for rectangle extraction is proposed. We define the constraints between rectangles and edges. While scanning the horizontal and vertical lines, candidate edges are validated, and a rectangle is generated if its surrounding edges and their combination are all valid. The process is applied recursively. It can tolerate large breaks in form lines, ignore irrelevant segments, and deal with embedded rectangles. Experiments on a collection of forms show that our approach works well on poor-quality images.
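The segment-grouping step can be illustrated with a minimal sketch. This is not the authors' recursive splitting and merging algorithm, which the abstract does not specify; it is a simple greedy merge of horizontal segments, shown only to make the idea of tolerating breaks and dropping short false segments concrete. The function name `merge_segments`, the `(y, x_start, x_end)` representation, and the parameters `max_gap`, `max_offset`, and `min_length` are all assumptions introduced for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: a horizontal segment is (y, x_start, x_end).
Segment = Tuple[int, int, int]


def merge_segments(
    segments: List[Segment],
    max_gap: int = 10,     # assumed: largest break (pixels) bridged within one line
    max_offset: int = 2,   # assumed: largest vertical drift between collinear segments
    min_length: int = 0,   # assumed: merged lines shorter than this are treated as false
) -> List[Segment]:
    """Greedily group roughly collinear, overlapping or nearby segments into lines."""
    lines: List[List[int]] = []
    # Process segments left to right so each line only ever grows to the right.
    for y, x0, x1 in sorted(segments, key=lambda s: s[1]):
        for line in lines:
            same_row = abs(line[0] - y) <= max_offset
            # A negative difference means overlap; a small positive one is a break.
            close_enough = x0 - line[2] <= max_gap
            if same_row and close_enough:
                line[2] = max(line[2], x1)
                break
        else:
            # No existing line accepts this segment, so it starts a new one.
            lines.append([y, x0, x1])
    return [(y, a, b) for y, a, b in lines if b - a >= min_length]


example = [(10, 0, 50), (10, 40, 90), (11, 95, 150), (10, 300, 320), (40, 0, 100)]
merged = merge_segments(example)
filtered = merge_segments(example, min_length=30)
```

In this sketch the first three segments collapse into one line despite an overlap, a 5-pixel break, and a 1-pixel vertical drift, while raising `min_length` discards the short isolated segment. Vertical segments could be handled the same way by swapping the coordinate roles.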
Keywords :
business forms; document image processing; merging; optical character recognition; false segments; filled forms; horizontal line segments; merging algorithm; overlapping segments; rectangle extraction; recursive splitting; unknown forms analysis; vertical line segments; Automation; Computer science; Data engineering; Educational institutions; Image segmentation; Lapping; Machine vision; Merging; Robustness; Systems engineering and theory
Conference_Title :
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location :
Bangalore
Print_ISBN :
0-7695-0318-7
DOI :
10.1109/ICDAR.1999.791842