• DocumentCode
    1994113
  • Title
    Automated detection and segmentation of table of contents page from document images
  • Author
    Mandal, S.; Chowdhury, S.P.; Das, A.K.; Chanda, Bhabatosh
  • Author_Institution
    Bengal Eng. Coll., Howrah, India
  • fYear
    2003
  • fDate
    3-6 Aug. 2003
  • Firstpage
    398
  • Abstract
    Extracting structural information from the table of contents (TOC) to help build a digital document library first requires identifying and segmenting the TOC page. The objective of creating a digital document library is to provide a non-labour-intensive, cheap and flexible way of storing, representing and managing paper documents in electronic form, so as to facilitate indexing, viewing, printing and extraction of the intended portions. Information extracted from the TOC pages is to be used in a document database for effective retrieval of the required pages. We present a fully automatic method for identifying and segmenting a TOC page from a scanned document.
  • Keywords
    character recognition; digital libraries; document image processing; image segmentation; information retrieval; visual databases; TOC page identification; automated detection; automatic identification; digital document library development; document database; document image segmentation; document images; electronic form; information extraction; nonlabour intensive document storage; page segmentation; paper document; scanned document; structural information; table of contents detection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on
  • Print_ISBN
    0-7695-1960-1
  • Type
    conf
  • DOI
    10.1109/ICDAR.2003.1227697
  • Filename
    1227697