Title :
Parameter-independent geometric document layout analysis
Author :
Ryu, Dae-Seok ; Kang, Sun-mee ; Lee, Seong-Whan
Author_Institution :
Center for Artificial Vision Res., Korea Univ., Seoul, South Korea
Abstract :
We propose a new parameter-independent method for segmenting document images into maximal homogeneous regions and identifying them as text, images, tables, and lines. A pyramidal quadtree structure is constructed for multiscale, top-down analysis, and a periodicity measure is introduced to detect the periodic attribute of text regions. To obtain robust page segmentation results, a confirmation procedure using texture analysis is applied only to ambiguous regions. Experimental results on the University of Washington document database show that the proposed method performs better than previous methods.
Keywords :
computational geometry; document image processing; image segmentation; image texture; quadtrees; ambiguous regions; confirmation procedure; document database; document image segmentation; maximal homogeneous region type identification; multiscale analysis; parameter-independent geometric document layout analysis; parameter-independent method; periodical attribute; periodicity measure; pyramidal quadtree structure; robust page segmentation; text regions; texture analysis; top-down approach; Books; Computer science; Content based retrieval; Databases; Humans; Image segmentation; Image texture analysis; Information technology; Robustness; Text analysis
Conference_Title :
Pattern Recognition, 2000. Proceedings. 15th International Conference on
Conference_Location :
Barcelona
Print_ISBN :
0-7695-0750-6
DOI :
10.1109/ICPR.2000.902942