DocumentCode :
3340822
Title :
On the Reading of Tables of Contents
Author :
Sarkar, Prateek ; Saund, Eric
Author_Institution :
Palo Alto Res. Center, Palo Alto, CA
fYear :
2008
fDate :
16-19 Sept. 2008
Firstpage :
386
Lastpage :
393
Abstract :
This paper presents a framework for understanding tables of contents (TOCs) of books, journals, and magazines. We propose a universal logical structure representation in terms of a hierarchy of entries, each of which may contain a descriptor and a locator. We enumerate graphical and perceptual cues that guide the parsing of tables of contents in terms of this formalism. We make initial suggestions about the form of evaluation metrics for comparing ground-truthed tables of contents with the output of recognition algorithms. Typical and atypical tables of contents are used throughout to illustrate significant phenomena that must be dealt with in principled ways in any general TOC interpretation scheme. Finally, we discuss the implications of our observations for the design of recognition algorithms.
Keywords :
information analysis; evaluation metrics; recognition algorithms design; tables of contents; universal logical structure representation; functional role labeling; ground truth; layout analysis; structure extraction
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis Systems, 2008. DAS '08. The Eighth IAPR International Workshop on
Conference_Location :
Nara
Print_ISBN :
978-0-7695-3337-7
Type :
conf
DOI :
10.1109/DAS.2008.87
Filename :
4669985