DocumentCode
2020970
Title
XML Data Representation in Document Image Analysis
Author
Belaïd, Abdel ; Falk, Ingrid ; Rangoni, Yves
Author_Institution
Univ. Nancy 2, Vandoeuvre-lès-Nancy
Volume
1
fYear
2007
fDate
23-26 Sept. 2007
Firstpage
78
Lastpage
82
Abstract
This paper presents the XML-based formats ALTO, TEI, and METS, which are used in digital libraries, and examines their relevance for data representation in a document image analysis and recognition (DIAR) process. The first part briefly presents these formats, focusing on their adequacy for the structural representation and modeling of DIAR data. The second part shows how these formats can be used in a reverse engineering process and how they can be implemented as a data representation framework.
Keywords
XML; document image processing; image recognition; image representation; ALTO; METS; TEI; XML data representation; XML-based formats; digital libraries; document image analysis; document image recognition; structural modeling; structural representation; Encoding; Guidelines; Image analysis; Image recognition; Optical character recognition software; Reverse engineering; Software libraries; Text analysis; Text recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
Conference_Location
Parana
ISSN
1520-5363
Print_ISBN
978-0-7695-2822-9
Type
conf
DOI
10.1109/ICDAR.2007.4378679
Filename
4378679