Title :
A hybrid page segmentation method
Author :
Okamoto, Masayuki ; Takahashi, Makoto
Author_Institution :
Dept. of Inf. Eng., Shinshu Univ., Nagano, Japan
Abstract :
A page segmentation method that uses field separators and white streams is described and applied to the layout analysis of various types of printed pages, which may have horizontal and vertical textlines. In complex page layouts, closely spaced text columns are often separated by thin black lines (field separators) or long white spaces (white streams). These separators are first extracted by scanning the page horizontally and vertically, and the page is then globally partitioned into blocks. Next, within each block, black connected components are merged into textlines, horizontally or vertically, along the direction of the separators. In experimental trials on various types of page layouts, the method produced robust and fast results.
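The separator-extraction and global-partitioning steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it handles only vertical white streams (not black field separators), it assumes the page is a binary image stored as a list of rows with 1 for black and 0 for white, and the function names and the `min_width` threshold are hypothetical choices made for illustration.

```python
# Illustrative sketch only: a projection-profile stand-in for white-stream
# extraction. Names and the min_width threshold are assumptions, not taken
# from the paper.

def white_runs(profile, min_len):
    """Return (start, end) pairs (end exclusive) for runs of zeros
    in `profile` that are at least `min_len` entries long."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value == 0 and start is None:
            start = i                      # a white run begins
        elif value != 0 and start is not None:
            if i - start >= min_len:       # keep only long enough runs
                runs.append((start, i))
            start = None
    if start is not None and len(profile) - start >= min_len:
        runs.append((start, len(profile)))  # run reaching the page edge
    return runs


def vertical_white_streams(page, min_width):
    """Vertical scan: columns with no black pixel over the full page
    height, grouped into runs at least `min_width` columns wide."""
    col_profile = [sum(row[x] for row in page) for x in range(len(page[0]))]
    return white_runs(col_profile, min_width)


def split_columns(page, min_width):
    """Partition the page into column blocks lying between white streams.
    Returns (left, right) column ranges, right exclusive."""
    width = len(page[0])
    blocks, left = [], 0
    for start, end in vertical_white_streams(page, min_width):
        if start > left:
            blocks.append((left, start))
        left = end
    if left < width:
        blocks.append((left, width))
    return blocks


# Tiny 3x8 example page: two text columns separated by a 3-pixel white gap.
page = [
    [1, 1, 1, 0, 0, 0, 1, 1],
    [1, 0, 1, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 0, 0, 1],
]
streams = vertical_white_streams(page, min_width=2)
blocks = split_columns(page, min_width=2)
```

Horizontal white streams could be found the same way by applying `white_runs` to a row profile. Merging connected components into textlines within each block is a separate step that this sketch does not cover.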
Keywords :
document image processing; image segmentation; black connected components; field separators; global partitioning; horizontal scanning; horizontal textlines; hybrid page segmentation method; layout analysis; page layouts; printed pages; text columns; vertical scanning; vertical textlines; white streams; Image segmentation; Information analysis; National electric code; Particle separators; Robustness; White spaces
Conference_Title :
Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on
Conference_Location :
Tsukuba Science City
Print_ISBN :
0-8186-4960-7
DOI :
10.1109/ICDAR.1993.395630