DocumentCode :
2629167
Title :
Image based typographic analysis of documents
Author :
Doermann, David S. ; Furuta, Richard
Author_Institution :
Center for Automation Research, University of Maryland, College Park, MD, USA
fYear :
1993
fDate :
20-22 Oct 1993
Firstpage :
769
Lastpage :
773
Abstract :
An approach to image-based typographic analysis of documents is presented. The problem requires a spatial understanding of the document layout as well as knowledge of the proper syntax. The system performs a page synthesis from the stream of formatting commands defined in a DVI file. Since the two-dimensional relationships between document components are not explicit in the page language, the authors develop a representation that preserves the two-dimensional layout, the read-order, and the attributes of document components. From this hierarchical representation of the page layout, they extract and analyze relevant typographic features such as margins, line and character spacing, and figure placement.
Keywords :
document image processing; feature extraction; page description languages; spatial data structures; 2D relationships; DVI file; character spacing; data representation; document component attributes; document layout; figure placement; formatting commands; hierarchical representation; image based typographic analysis; line spacing; margins; page language; page layout; page synthesis; read-order; spatial understanding; syntax; Automation; Computer errors; Graphics; Image analysis; Layout; Page description languages; Printers; Printing; Text analysis; Typesetting
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on
Conference_Location :
Tsukuba Science City
Print_ISBN :
0-8186-4960-7
Type :
conf
DOI :
10.1109/ICDAR.1993.395624
Filename :
395624