• DocumentCode
    3058172
  • Title

    A fast and efficient method for extracting text paragraphs and graphics from unconstrained documents

  • Author

    Lebourgeois, F. ; Bublinski, Z. ; Emptoz, H.

  • Author_Institution
    Lab. de Modelisation des Syst. et Reconnaissance de Formes, INSA de Lyon, Villeurbanne, France
  • fYear
    1992
  • fDate
    30 Aug-3 Sep 1992
  • Firstpage
    272
  • Lastpage
    276
  • Abstract
    Outlines a fast and efficient method for extracting graphics and text paragraphs from printed documents. The method is based on a bottom-up approach to document analysis and achieves very good performance in most cases. During preprocessing, characters are linked together to form blocks. The resulting blocks are segmented, labelled and merged into paragraphs. Simultaneously, graphics are extracted from the image. Algorithms for each processing step are presented, and experimental results are included.
  • Keywords
    document image processing; image segmentation; text editing; document analysis; document processing; graphics extraction; labelling; run length smoothing algorithm; segmentation; text paragraph extraction; unconstrained documents; Data mining; Graphics; Image analysis; Image segmentation; Joining processes; Performance analysis; Pixel; Reconnaissance; Smoothing methods; Text analysis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 1992. Vol.II. Conference B: Pattern Recognition Methodology and Systems, Proceedings., 11th IAPR International Conference on
  • Conference_Location
    The Hague
  • Print_ISBN
    0-8186-2915-0
  • Type

    conf

  • DOI
    10.1109/ICPR.1992.201771
  • Filename
    201771