• DocumentCode
    319602
  • Title
    Intelligent hierarchical layout segmentation of document images on the basis of colour content
  • Author
    Mighlani, D. ; Hennig, A. ; Sherkat, N. ; Whitrow, R.J.
  • Author_Institution
    Dept. of Comput., Trento Univ., Italy
  • Volume
    1
  • fYear
    1997
  • fDate
    4-4 Dec. 1997
  • Firstpage
    191
  • Abstract
    This paper proposes a general methodology for automatic layout segmentation of documents. Colour histograms are first used to extract the dominant colours of an image. This information is then used to segment documents hierarchically into regions of interest, which are represented as polygons. If a region of interest is a picture, the algorithm refrains from segmenting it further, whereas coloured regions that contain text are subsegmented. The method has been tested on 50 real-life documents, such as office letters, brochures, and technical papers, scanned at 100×100 dpi resolution. Regions are detected with about 68% reliability. A critical analysis of the results is presented.
  • Keywords
    document image processing; feature extraction; image colour analysis; image representation; image segmentation; algorithm; automatic layout segmentation; brochures; colour content; colour extraction; colour histograms; document images; image database; intelligent hierarchical layout segmentation; office letters; polygons; regions of interest; resolution; technical papers; Data mining; Graphics; Histograms; Image databases; Image segmentation; Life testing; Microcomputers; Pixel; Process design; World Wide Web
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    TENCON '97. IEEE Region 10 Annual Conference. Speech and Image Technologies for Computing and Telecommunications., Proceedings of IEEE
  • Conference_Location
    Brisbane, Qld., Australia
  • Print_ISBN
    0-7803-4365-4
  • Type
    conf
  • DOI
    10.1109/TENCON.1997.647289
  • Filename
    647289
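
The abstract's first step, extracting dominant colours from a colour histogram, can be illustrated with a minimal sketch. The record does not give the paper's actual procedure, so everything here is an assumption made for illustration: the function name `dominant_colours`, the coarse per-channel quantisation (`bits`), and the `top_k` and `min_share` parameters are all hypothetical, and pixels are taken to be 8-bit RGB tuples.

```python
from collections import Counter


def dominant_colours(pixels, bits=2, top_k=3, min_share=0.05):
    """Return [(bin_centre_rgb, share)] for the most frequent colour bins.

    Illustrative only: this is not the method from the paper, just one
    simple way to read dominant colours off a coarse colour histogram.
    """
    pixels = list(pixels)
    if not pixels:
        return []

    shift = 8 - bits            # low-order bits discarded per channel
    half = (1 << shift) >> 1    # offset from bin start to bin centre

    # Coarse colour histogram: each pixel falls into one (r, g, b) bin.
    hist = Counter((r >> shift, g >> shift, b >> shift) for r, g, b in pixels)

    total = len(pixels)
    result = []
    for (rq, gq, bq), count in hist.most_common(top_k):
        share = count / total
        if share < min_share:
            break  # remaining bins are even less frequent
        centre = (
            (rq << shift) + half,
            (gq << shift) + half,
            (bq << shift) + half,
        )
        result.append((centre, share))
    return result


# A synthetic "page": mostly white background, some black text, a red logo.
page = [(255, 255, 255)] * 70 + [(0, 0, 0)] * 20 + [(250, 10, 10)] * 10
for colour, share in dominant_colours(page):
    print(colour, round(share, 2))
```

In a hierarchical scheme such as the one the abstract outlines, each dominant colour could then seed a candidate region, with the same analysis repeated inside regions that contain text; how the paper actually does this is not stated in the record.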