DocumentCode :
2629196
Title :
Page grammars and page parsing. A syntactic approach to document layout recognition
Author :
Conway, Alan
Author_Institution :
Hitachi Dublin Lab., Trinity Coll., Dublin, Ireland
fYear :
1993
fDate :
20-22 Oct 1993
Firstpage :
761
Lastpage :
764
Abstract :
Describes a syntactic approach to deducing the logical structure of printed documents from their physical layout. Page layout is described by a two-dimensional grammar, similar to a context-free string grammar, and a chart parser is used to parse segmented page images according to the grammar. This process is part of a system that reads scanned document images and produces computer-readable text in a logical mark-up format such as SGML. The system is briefly outlined, the grammar formalism and the parsing algorithm are described in detail, and some experimental results are reported.
Keywords :
context-free grammars; document image processing; image recognition; page description languages; 2D grammar; SGML; chart parser; computer-readable text; context-free string grammar; document layout recognition; logical document structure deduction; logical mark-up format; page grammars; page layout; page parsing; scanned document images; segmented page images; syntactic approach; Character recognition; Educational institutions; Graphics; Image segmentation; Indexing; Laboratories; Layout; SGML; Text recognition; Tree graphs;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on
Conference_Location :
Tsukuba Science City
Print_ISBN :
0-8186-4960-7
Type :
conf
DOI :
10.1109/ICDAR.1993.395626
Filename :
395626