• DocumentCode
    3211689
  • Title

    A rule-based system for document image segmentation

  • Author

    Fisher, James L. ; Hinds, Stuart C. ; D'Amato, Donald P.

  • Author_Institution
    Mitre Corp., McLean, VA, USA
  • Volume
    i
  • fYear
    1990
  • fDate
    16-21 Jun 1990
  • Firstpage
    567
  • Abstract
    A rule-based system for automatically segmenting a document image into regions of text and nontext is presented. The initial stages of the system perform image enhancement functions such as adaptive thresholding, morphological processing, and skew detection and correction. The image segmentation process consists of smearing the original image via the run length smoothing algorithm, calculating the locations and statistics of the connected components, and filtering (segmenting) the image based on these statistics. The text regions can be converted (via an optical character reader) to a computer-searchable form, and the nontext regions can be extracted and preserved. The rule-based structure allows easy fine tuning of the algorithmic steps to produce robust rules, to incorporate additional tools (as they become available), and to handle special segmentation needs.
  • Keywords
    computerised pattern recognition; document image processing; knowledge based systems; statistical analysis; adaptive thresholding; document image segmentation; filtering; image enhancement; morphological processing; rule-based system; run length smoothing algorithm; skew detection; Filtering algorithms; Image converters; Image enhancement; Image segmentation; Knowledge based systems; Optical computing; Optical filters; Optical tuning; Smoothing methods; Statistics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 1990. Proceedings., 10th International Conference on
  • Conference_Location
    Atlantic City, NJ
  • Print_ISBN
    0-8186-2062-5
  • Type

    conf

  • DOI
    10.1109/ICPR.1990.118166
  • Filename
    118166
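
The smearing step named in the abstract (the run length smoothing algorithm) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the threshold parameters, the convention that 1 is black and 0 is white, the choice to fill only white runs bounded by black on both sides, and the logical-AND combination of the horizontal and vertical passes are all assumptions based on the commonly described form of the algorithm.

```python
def rlsa_row(row, threshold):
    """Smear one binary row (1 = black, 0 = white).

    Assumption: a white run is filled only if it lies between two black
    pixels and its length is at most `threshold`.
    """
    out = list(row)
    last_black = None  # index of the most recent black pixel seen
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_black is not None:
                gap = i - last_black - 1  # white pixels between the two
                if 0 < gap <= threshold:
                    for j in range(last_black + 1, i):
                        out[j] = 1
            last_black = i
    return out


def rlsa_horizontal(image, threshold):
    """Apply the row smear to every row of a list-of-lists image."""
    return [rlsa_row(row, threshold) for row in image]


def rlsa_vertical(image, threshold):
    """Apply the same smear down columns: transpose, smear, transpose back."""
    columns = [list(col) for col in zip(*image)]
    smeared = [rlsa_row(col, threshold) for col in columns]
    return [list(r) for r in zip(*smeared)]


def rlsa(image, h_threshold, v_threshold):
    """Combine both passes with a pixelwise AND (assumed combination rule)."""
    horiz = rlsa_horizontal(image, h_threshold)
    vert = rlsa_vertical(image, v_threshold)
    return [[a & b for a, b in zip(hr, vr)] for hr, vr in zip(horiz, vert)]


if __name__ == "__main__":
    # The gap of 2 is filled; the gap of 4 exceeds the threshold and is kept.
    print(rlsa_row([1, 0, 0, 1, 0, 0, 0, 0, 1], threshold=2))
```

With a threshold of 2, the short gap between the first two black pixels is filled while the longer gap is left white, which is how nearby characters merge into blocks whose connected components can then be measured and filtered.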