Title :
Structured document segmentation and representation by the modified X-Y tree
Author :
Cesarini, F. ; Gori, M. ; Marinai, Simone ; Soda, G.
Author_Institution :
DSI, Univ. di Firenze, Italy
Abstract :
We describe a top-down approach to the segmentation and representation of documents containing tabular structures. Examples of such documents are invoices and technical papers with tables. The segmentation is based on an extension of X-Y trees in which regions are split by cuts along separators (e.g., lines) in addition to cuts along white spaces. The leaves describe regions containing homogeneous information and cutting separators. Adjacency links among leaves of the tree describe local relationships between the corresponding regions.
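The recursive cutting described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grid encoding (`WHITE`, `INK`, `LINE`), the function names, and the rule that a separator must span the whole region are all assumptions made for illustration, and adjacency links between leaves are omitted.

```python
# Hypothetical cell labels for a tiny, already-classified page grid.
WHITE, INK, LINE = 0, 1, 2


def split(grid, box, axis):
    """Split box along rows (axis=0) or columns (axis=1).

    Cuts are made at rows/columns that are entirely white space or entirely
    separator line. Returns a list of (kind, lo, hi) parts.
    """
    top, bottom, left, right = box
    if axis == 0:
        start, end = top, bottom
        profile = [set(grid[r][left:right]) for r in range(top, bottom)]
    else:
        start, end = left, right
        profile = [{grid[r][c] for r in range(top, bottom)}
                   for c in range(left, right)]
    parts, run_start = [], None
    for i, labels in enumerate(profile):
        pos = start + i
        if labels == {WHITE} or labels == {LINE}:
            if run_start is not None:          # close the open content run
                parts.append(("region", run_start, pos))
                run_start = None
            if labels == {LINE}:
                if parts and parts[-1][0] == "separator" and parts[-1][2] == pos:
                    parts[-1] = ("separator", parts[-1][1], pos + 1)  # thick line
                else:
                    parts.append(("separator", pos, pos + 1))
        elif run_start is None:
            run_start = pos
    if run_start is not None:
        parts.append(("region", run_start, end))
    return parts


def xy_tree(grid, box=None, axis=0, tried_other=False):
    """Build the tree by alternating horizontal and vertical cuts."""
    if box is None:
        box = (0, len(grid), 0, len(grid[0]))
    top, bottom, left, right = box
    parts = split(grid, box, axis)
    if len(parts) <= 1:                        # no cut in this direction
        if tried_other:
            return {"kind": "region", "box": box}
        return xy_tree(grid, box, 1 - axis, True)
    children = []
    for kind, lo, hi in parts:
        child = (lo, hi, left, right) if axis == 0 else (top, bottom, lo, hi)
        if kind == "separator":
            children.append({"kind": "separator", "box": child})
        else:
            children.append(xy_tree(grid, child, 1 - axis))
    return {"kind": "cut", "axis": axis, "box": box, "children": children}


def leaves(node):
    """Collect leaf nodes (regions and separators) in reading order."""
    if node["kind"] != "cut":
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


# A header row, a horizontal rule, then two cells split by a vertical rule.
grid = [
    [1, 1, 1, 1, 1],
    [2, 2, 2, 2, 2],
    [1, 1, 2, 1, 1],
    [1, 1, 2, 1, 1],
]
for leaf in leaves(xy_tree(grid)):
    print(leaf["kind"], leaf["box"])
```

Boxes are `(top, bottom, left, right)` with exclusive upper bounds. In this sketch the two table cells touch the vertical rule with no white space between them, so a plain white-space-only X-Y cut would leave them merged; cutting along the separator is what splits them.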
Keywords :
document image processing; image representation; image segmentation; tree data structures; adjacency links; cutting separators; document representation; modified X-Y tree; structured document segmentation; tabular structures; top-down approach; Electrical capacitance tomography; Hip; Identity-based encryption; Image analysis; Image storage; Particle separators; Postal services; Smoothing methods; Text analysis
Conference_Titel :
Proceedings of the Fifth International Conference on Document Analysis and Recognition (ICDAR '99), 1999
Conference_Location :
Bangalore
Print_ISBN :
0-7695-0318-7
DOI :
10.1109/ICDAR.1999.791850