DocumentCode
1994113
Title
Automated detection and segmentation of table of contents page from document images
Author
Mandal, S. ; Chowdhury, S.P. ; Das, A.K. ; Chanda, Bhabatosh
Author_Institution
Bengal Eng. Coll., Howrah, India
fYear
2003
fDate
3-6 Aug. 2003
Firstpage
398
Abstract
Extracting structural information from the table of contents (TOC) to help build a digital document library first requires identifying and segmenting the TOC page. The objective of a digital document library is to provide a non-labour-intensive, cheap and flexible way of storing, representing and managing paper documents in electronic form, so as to facilitate indexing, viewing, printing and extraction of the intended portions. Information extracted from the TOC pages is to be used in a document database for effective retrieval of the required pages. We present a fully automatic method for identifying and segmenting a table of contents (TOC) page from a scanned document.
Keywords
character recognition; digital libraries; document image processing; image segmentation; information retrieval; visual databases; TOC page identification; automated detection; automatic identification; digital document library development; document database; document image segmentation; document images; electronic form; information extraction; information retrieval; nonlabour intensive document storage; page segmentation; paper document; scanned document; structural information; table of contents detection;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on
Print_ISBN
0-7695-1960-1
Type
conf
DOI
10.1109/ICDAR.2003.1227697
Filename
1227697