DocumentCode
3058172
Title
A fast and efficient method for extracting text paragraphs and graphics from unconstrained documents
Author
Lebourgeois, F. ; Bublinski ; Emptoz, H.
Author_Institution
Lab. de Modelisation des Syst. et Reconnaissance de Formes, INSA de Lyon, Villeurbanne, France
fYear
1992
fDate
30 Aug-3 Sep 1992
Firstpage
272
Lastpage
276
Abstract
Outlines a fast and efficient method for extracting graphics and text paragraphs from printed documents. The method is based on a bottom-up approach to document analysis and achieves very good performance in most cases. During preprocessing, characters are linked together to form blocks. The resulting blocks are segmented, labelled and merged into paragraphs. Simultaneously, graphics are extracted from the image. Algorithms for each processing step are presented, and experimental results are included.
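The abstract says characters are linked into blocks during preprocessing, and the keywords list a run length smoothing algorithm. The following is a minimal sketch of generic horizontal run-length smoothing on a binary image, not the authors' implementation. The function names `smooth_row` and `smooth_image`, the 0/1 pixel encoding, and the `threshold` parameter are assumptions made for illustration.

```python
def smooth_row(row, threshold):
    """Fill short background gaps between foreground pixels in one row.

    Assumed encoding: 1 = foreground (ink), 0 = background.
    A run of background pixels is filled only if it lies between two
    foreground pixels and its length is at most `threshold`.
    """
    out = list(row)
    last_fg = None  # index of the most recent foreground pixel seen
    for i, px in enumerate(row):
        if px == 1:
            if last_fg is not None:
                gap = i - last_fg - 1
                if 0 < gap <= threshold:
                    for j in range(last_fg + 1, i):
                        out[j] = 1
            last_fg = i
    return out


def smooth_image(image, threshold):
    """Apply horizontal smoothing to every row of a binary image."""
    return [smooth_row(row, threshold) for row in image]


if __name__ == "__main__":
    # Hypothetical row: two nearby characters, then a distant one.
    row = [1, 0, 0, 1, 0, 0, 0, 0, 1]
    print(smooth_row(row, threshold=2))
```

With a small threshold, neighbouring characters merge into one run while widely separated components stay apart. This is the usual way smoothing turns characters into candidate blocks before segmentation and labelling.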
Keywords
document image processing; image segmentation; text editing; document analysis; document processing; graphics extraction; labelling; run length smoothing algorithm; segmentation; text paragraph extraction; unconstrained documents; Data mining; Graphics; Image analysis; Image segmentation; Joining processes; Performance analysis; Pixel; Reconnaissance; Smoothing methods; Text analysis;
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition, 1992. Vol.II. Conference B: Pattern Recognition Methodology and Systems, Proceedings., 11th IAPR International Conference on
Conference_Location
The Hague
Print_ISBN
0-8186-2915-0
Type
conf
DOI
10.1109/ICPR.1992.201771
Filename
201771