DocumentCode
311137
Title
ODIL: an SGML description language of the layout structure of documents
Author
Lefèvre, Philippe ; Reynaud, François
Author_Institution
Électricité de France, Clamart, France
Volume
1
fYear
1995
fDate
14-16 Aug 1995
Firstpage
480
Abstract
This paper describes an SGML coding format for the output of a document recognition prototype. Our proposal is a DTD named “ODIL” (Office Document Image description Language) that precisely describes the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined as SGML elements, and their characteristics are defined by SGML attributes. The basic objects are blocks, each containing homogeneous information. The ODIL language supports five types of information: text, photos, line graphics, tables, and mathematical formulas. The ODIL representation of the recognition results is well suited to subsequent logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DTD makes it possible to use SGML tools for logical structure recognition, which is viewed as an SGML up-conversion problem.
Keywords
document image processing; image segmentation; page description languages; OCR; ODIL; ODL; Office Document Image description Language; RAINBOW transit DTD; SGML; SGML description language; coding format; document recognition; document recognition prototype; layout structure; logical structure recognition; segmentation; up-conversion; Graphics; Image recognition; Image segmentation; Layout; Mathematics; Optical character recognition software; Pixel; Proposals; Prototypes; SGML;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
Conference_Location
Montreal, Que.
Print_ISBN
0-8186-7128-9
Type
conf
DOI
10.1109/ICDAR.1995.599040
Filename
599040