Title :
A form dropout system
Author :
Yu, Bin ; Jain, Anil K.
Author_Institution :
Dept. of Comput. Sci., Michigan State Univ., East Lansing, MI, USA
Abstract :
This paper describes a system for form dropout when the filled-in characters or symbols are either touching or crossing the form frames and the form model is unknown. Since some of the character strokes are either touching or crossing the form frames, we need to address the following three issues: (i) localization of form frames; (ii) separation between characters and form frames; and (iii) reconstruction of broken strokes introduced during separation. The form frame is automatically located by finding long straight lines based on a data structure called a block adjacency graph. Form frame removal and character reconstruction are implemented in this graph. When the same process is applied to a blank form, followed by connected component extraction and clustering, a form structure-based template is automatically generated, which includes the form model, the skew angle, and the preprinted data areas. Given the form template, our system can extract both handwritten and machine-typed filled-in data. Experimental results on three different types of forms demonstrate the performance of our system.
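The three steps named in the abstract (locating frame lines, separating them from characters, and keeping strokes that cross a line) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a simplified run-adjacency graph over horizontal pixel runs rather than the paper's block adjacency graph, it assumes a binary image given as nested lists with frame lines exactly one pixel thick, and all function names and the `min_length` threshold are hypothetical.

```python
from typing import Dict, List, Tuple

# A run is a maximal horizontal stretch of foreground pixels:
# (row, start_col, end_col), with end_col inclusive.
Run = Tuple[int, int, int]


def extract_runs(image: List[List[int]]) -> List[Run]:
    """Run-length encode every row of a binary image (1 = ink)."""
    runs: List[Run] = []
    for r, row in enumerate(image):
        c, n = 0, len(row)
        while c < n:
            if row[c]:
                start = c
                while c < n and row[c]:
                    c += 1
                runs.append((r, start, c - 1))
            else:
                c += 1
    return runs


def build_adjacency(runs: List[Run]) -> Dict[int, List[int]]:
    """Link runs on consecutive rows whose column ranges overlap."""
    by_row: Dict[int, List[int]] = {}
    for i, (r, _, _) in enumerate(runs):
        by_row.setdefault(r, []).append(i)
    adj: Dict[int, List[int]] = {i: [] for i in range(len(runs))}
    for i, (r, s, e) in enumerate(runs):
        for j in by_row.get(r + 1, []):
            _, s2, e2 = runs[j]
            if s2 <= e and s <= e2:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def long_horizontal_runs(runs: List[Run], min_length: int) -> List[int]:
    """Indices of runs long enough to be treated as frame-line candidates."""
    return [i for i, (_, s, e) in enumerate(runs) if e - s + 1 >= min_length]


def drop_frame(image: List[List[int]], runs: List[Run],
               frame_ids: List[int]) -> List[List[int]]:
    """Erase frame runs, keeping pixels where ink lies both above and below.

    Assumption: a pixel with ink directly above and below belongs to a
    stroke crossing a one-pixel-thick line, so it is kept to avoid
    breaking the stroke.
    """
    out = [row[:] for row in image]
    height = len(image)
    for i in frame_ids:
        r, s, e = runs[i]
        for c in range(s, e + 1):
            above = r > 0 and image[r - 1][c]
            below = r + 1 < height and image[r + 1][c]
            if not (above and below):
                out[r][c] = 0
    return out
```

In this sketch the adjacency graph is only built, not yet used for removal; a fuller version would merge vertically adjacent runs of similar extent into blocks and use the graph to handle thick or skewed lines and strokes that merely touch a line.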
Keywords :
data structures; document image processing; image reconstruction; image segmentation; optical character recognition; block adjacency graph; broken stroke reconstruction; character reconstruction; component clustering; connected component extraction; data structure; form dropout system; form frame localization; form frame removal; form model; handwritten data; machine-typed data; preprinted data areas; skew angle; Computer science; Costs; Data mining; Data structures; Government; Graphics; Image segmentation; Ink; Machine intelligence; Text analysis;
Conference_Title :
Pattern Recognition, 1996., Proceedings of the 13th International Conference on
Conference_Location :
Vienna
Print_ISBN :
0-8186-7282-X
DOI :
10.1109/ICPR.1996.547036