• DocumentCode
    1376003
  • Title
    Adaptive document block segmentation and classification
  • Author
    Shih, Frank Y.; Chen, Shy-Shyan
  • Author_Institution
    Comput. Vision Lab., New Jersey Inst. of Technol., Newark, NJ, USA
  • Volume
    26
  • Issue
    5
  • fYear
    1996
  • fDate
    10/1/1996
  • Firstpage
    797
  • Lastpage
    802
  • Abstract
    This paper presents an adaptive block segmentation and classification technique for daily-received office documents having complex layout structures such as multiple columns and mixed-mode contents of text, graphics, and pictures. First, an improved two-step block segmentation algorithm based on run-length smoothing is performed to decompose any document into single-mode blocks. Then, a rule-based block classification is used to classify each block into the text, horizontal/vertical line, graphics, or picture type. The document features and rules used are independent of character font and size and of the scanning resolution. Experimental results show that our algorithms are capable of correctly segmenting and classifying different types of mixed-mode printed documents.
  • Keywords
    document image processing; fuzzy control; image classification; image segmentation; knowledge based systems; adaptive block classification; adaptive document block segmentation; complex layout structures; daily-received office documents; mixed-mode contents; multiple columns; rule-based block classification; run-length smoothing; Control systems; Design methodology; Fuzzy logic; Fuzzy systems; Graphics; Notice of Violation; Robust control; Three-term control; Two-term control
  • fLanguage
    English
  • Journal_Title
    Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1083-4419
  • Type
    jour
  • DOI
    10.1109/3477.537322
  • Filename
    537322