DocumentCode :
183216
Title :
Visual Perception of Unitary Elements for Layout Analysis of Unconstrained Documents in Heterogeneous Databases
Author :
Poirriez, Baptiste ; Lemaitre, A. ; Couasnon, Bertrand
Author_Institution :
Irisa - INSA, France
fYear :
2014
fDate :
1-4 Sept. 2014
Firstpage :
35
Lastpage :
40
Abstract :
Document layout analysis remains a challenging problem, particularly for heterogeneous documents. In this paper, we present our contribution to the layout analysis competition of the international Maurdor Campaign. Our method is based on a grammatical description of the content of elements. It consists of iteratively finding and then removing the most structuring elements of documents. The method draws on notions of perceptive vision: a combination of points of view of the document and the analysis of salient contents. Our description is generic enough to deal with a very wide range of heterogeneous documents. This method obtained second place in Run 2 of the Maurdor Campaign (on 1000 documents), and the best results in terms of pixel labeling for text blocks and graphic regions.
Keywords :
distributed databases; document image processing; text detection; document layout analysis; grammatical description; graphic regions; heterogeneous databases; heterogeneous document; international Maurdor campaign; layout analysis competition; perceptive vision; pixel labeling; salient content; text blocs; unconstrained document; unitary elements; visual perception; Context; Databases; Image segmentation; Layout; Optical character recognition software; Text analysis; business documents; document layout analysis; forms; heterogeneous documents; tables;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location :
Heraklion
ISSN :
2167-6445
Print_ISBN :
978-1-4799-4335-7
Type :
conf
DOI :
10.1109/ICFHR.2014.14
Filename :
6980993