• DocumentCode
    3209639
  • Title
    Document Page Layout Analysis Using Harris Corner Points
  • Author
    Nourbakhsh, Farshad ; Pati, Peeta Basa ; Ramakrishnan, A.G.
  • Author_Institution
    Indian Inst. of Sci., Bangalore
  • fYear
    2006
  • fDate
    Oct. 15 2006-Dec. 18 2006
  • Firstpage
    149
  • Lastpage
    152
  • Abstract
    Extracting text areas from document images with complex content and layout is a challenging task. A few texture-based techniques have already been proposed for extracting such text blocks. Most of them are computationally expensive and hence far from realizable in real time. In this work, we propose a modification to two existing texture-based techniques to reduce their computation, accomplished using Harris corner detectors. The efficiency of the two texture-based algorithms, one based on Gabor filters and the other on log-polar wavelet signatures, is compared. A combination of Gabor-feature-based texture classification performed on a smaller set of Harris-corner-detected points is observed to deliver both accuracy and efficiency.
  • Keywords
    Gabor filters; document image processing; edge detection; feature extraction; image classification; image texture; text analysis; wavelet transforms; Gabor feature based texture classification; Gabor filters; Harris corner point detectors; document images; document page layout analysis; log-polar wavelet signature; text block extraction; texture based techniques; Data mining; Detectors; Gabor filters; Image analysis; Image color analysis; Laboratories; Optical devices; Pixel; Storage automation; Text analysis; Gabor Filters; Harris Corner Detector; Log-polar Wavelet Signature; Manhattan Layout; Page Layout Analysis; Text Extraction;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Sensing and Information Processing, 2006. ICISIP 2006. Fourth International Conference on
  • Conference_Location
    Bangalore
  • Print_ISBN
    1-4244-0612-9
  • Electronic_ISBN
    1-4244-0612-9
  • Type
    conf
  • DOI
    10.1109/ICISIP.2006.4286083
  • Filename
    4286083
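
The idea summarized in the abstract, computing texture features only at Harris corner points rather than at every pixel, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy image, the function names (`harris_response`, `corner_points`, `gabor_response`), and all parameter values (`k`, `win`, `rel_thresh`, `wavelength`, `sigma`, `radius`, the four orientations) are assumptions chosen for illustration, and pure Python is used only to keep the example self-contained.

```python
import math

def harris_response(img, k=0.04, win=1):
    """Harris measure R = det(M) - k * trace(M)^2 at each interior pixel."""
    h, w = len(img), len(img[0])
    # Central-difference gradients; border pixels keep a zero gradient.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(win, h - win):
        for x in range(win, w - win):
            # Structure tensor M summed over a (2*win+1)^2 box window.
            sxx = syy = sxy = 0.0
            for dy in range(-win, win + 1):
                for dx in range(-win, win + 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            resp[y][x] = det - k * trace * trace
    return resp

def corner_points(resp, rel_thresh=0.1):
    """Keep pixels whose response exceeds a fraction of the peak response."""
    peak = max(max(row) for row in resp)
    if peak <= 0:
        return []  # no corner-like structure (e.g. a flat image)
    return [(y, x) for y, row in enumerate(resp)
            for x, v in enumerate(row) if v > rel_thresh * peak]

def gabor_response(img, y, x, theta, wavelength=4.0, sigma=2.0, radius=3):
    """Real (cosine) Gabor filter response centred on pixel (y, x)."""
    h, w = len(img), len(img[0])
    total = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                xr = dx * math.cos(theta) + dy * math.sin(theta)
                envelope = math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
                total += img[yy][xx] * envelope * math.cos(2 * math.pi * xr / wavelength)
    return total

# Toy "page" (assumed data): a bright 4x4 block on a dark 12x12 background.
image = [[1.0 if 4 <= y < 8 and 4 <= x < 8 else 0.0 for x in range(12)]
         for y in range(12)]
thetas = [0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4]

points = corner_points(harris_response(image))
# Gabor features are evaluated only at the corner points, not at every pixel.
features = {p: [gabor_response(image, p[0], p[1], t) for t in thetas]
            for p in points}
print(len(points), "of", 12 * 12, "pixels kept for Gabor filtering")
```

In a full pipeline the per-point feature vectors would then be passed to a texture classifier to label text versus non-text regions; that step is omitted here because the record does not describe the classifier.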