DocumentCode
2014679
Title
Example-Based Logical Labeling of Document Title Page Images
Author
van Beusekom, J. ; Keysers, Daniel ; Shafait, Faisal ; Breuel, Thomas M.
Author_Institution
Tech. Univ. Kaiserslautern, Kaiserslautern
Volume
2
fYear
2007
fDate
23-26 Sept. 2007
Firstpage
919
Lastpage
923
Abstract
This paper presents a flexible and effective example-based approach for labeling title pages which can be used for automated extraction of bibliographic data. The labels of interest are "title", "author", "abstract" and "affiliation". The method takes a set of labeled document layouts and a single unlabeled document layout as input and finds the best matching layout in the set. The labels of this layout are used to label the new layout. The similarity measure for layouts combines structural layout similarity and textural similarity on the block-level. Experimental results yield accuracy rates from 94.8% to 99.6% obtained on the publicly available MARG dataset. This shows that our lightweight method has equivalent and partially better performance when compared to other more complex labeling methods known from the literature.
Keywords
document handling; document title page image labeling; example-based logical labeling
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
Conference_Location
Parana
ISSN
1520-5363
Print_ISBN
978-0-7695-2822-9
Type
conf
DOI
10.1109/ICDAR.2007.4377049
Filename
4377049