Title :
Classification of forms with similar layouts based on Mixed Gaussian Weighted Mask
Author :
Simeng Wang; Liangcai Gao; Yuehan Wang
Author_Institution :
Institute of Computer Science and Technology, Peking University, Beijing, China 100871
Abstract :
As an essential step in form processing, form classification has attracted much attention from researchers. However, for forms with similar layouts, most previous classification methods still suffer from two issues: large variation in the areas of user-filled-in data and too few discriminative identifiers in the areas of preprinted data. In this paper, we propose a novel method based on a Mixed Gaussian Weighted Mask (MGWM) that identifies forms with similar layouts by leveraging information extracted from the areas of user-filled-in data, the areas of preprinted data, and the dithering data of a form. The proposed method combines three Gaussian weighted masks that account for, respectively, noise from the areas of user-filled-in data, layout consistency, and position dithering among form images. Experimental results show that the proposed method achieves more than 85% classification accuracy on a number of forms and outperforms the state-of-the-art form classification method.
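The abstract does not give the mask definitions, so the following is only a rough illustration of the general weighting idea, not the authors' MGWM formulation. Everything in it is an assumption made for illustration: the function names `gaussian_mask`, `mix_masks` and `weighted_distance`, the mixing weights, and the choice of a mask-weighted pixel difference as the comparison measure. A minimal Python sketch, under those assumptions, of mixing Gaussian-weighted masks and using the result to down-weight unreliable pixels when comparing two binarized form images:

```python
import math


def gaussian_mask(height, width, center, sigma):
    """Return a height x width grid whose values decay as a Gaussian with
    distance from `center` (row, col); the value at the center is 1.0.
    (Hypothetical helper, not taken from the paper.)"""
    cy, cx = center
    return [
        [math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def mix_masks(masks, weights):
    """Combine same-sized masks as a convex combination (weights sum to 1).
    The mixing rule is an assumption made for this sketch."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [sum(w * m[y][x] for m, w in zip(masks, weights)) for x in range(width)]
        for y in range(height)
    ]


def weighted_distance(img_a, img_b, mask):
    """Mask-weighted mean absolute difference between two images."""
    num = 0.0
    den = 0.0
    for row_a, row_b, row_m in zip(img_a, img_b, mask):
        for a, b, m in zip(row_a, row_b, row_m):
            num += m * abs(a - b)
            den += m
    return num / den if den > 0 else 0.0


# Two 5x5 binary "forms" that differ only in the bottom-right pixel.
form_a = [[0] * 5 for _ in range(5)]
form_b = [[0] * 5 for _ in range(5)]
form_b[4][4] = 1

uniform = [[1.0] * 5 for _ in range(5)]                       # every pixel counts equally
corner = gaussian_mask(5, 5, center=(0, 0), sigma=1.0)        # trusts the top-left region
mixed = mix_masks([corner, uniform], [0.5, 0.5])

d_uniform = weighted_distance(form_a, form_b, uniform)
d_corner = weighted_distance(form_a, form_b, corner)
d_mixed = weighted_distance(form_a, form_b, mixed)
```

In this toy setup the differing pixel lies far from the region that the `corner` mask trusts, so it contributes much less to the distance than under the uniform mask, which is the kind of effect a weighting mask is meant to have on variable, user-filled-in areas.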
Keywords :
Optical imaging
Conference_Title :
Document Analysis and Recognition (ICDAR), 2015 13th International Conference on
DOI :
10.1109/ICDAR.2015.7333736