Title :
A parallel algorithm for “document segmentation”
Author :
Ancona, M. ; De Benedetto, M.
Author_Institution :
Dipartimento di Inf. e Scienza dell'Inf., Genoa Univ., Italy
Abstract :
We present a parallel algorithm for the physical segmentation of technical documents. The proposed method follows a “data parallel” approach, based on a divide-and-conquer implementation. A document page is statically partitioned into n equal-sized rectangular blocks, where n is the number of processors. Each processor independently finds a segmentation of its assigned block, according to the same rules: row/column or-ing and profile xor-ing. Each segmentation is stored in the form of an xy-tree. The computed trees are combined, in pairs and in parallel, without re-examining the original image. In the paper we prove that the independently computed xy-trees can be efficiently combined, without using the original image, to form the global tree, that is, the tree obtained by a sequential application of the algorithm to the whole image. The method has been implemented on a LAN of workstations communicating through the PVM3 system.
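As a rough illustration of the per-block step described in the abstract, the sketch below builds an xy-tree for a single binary block by OR-ing rows and columns and cutting at empty bands. It is a minimal sequential sketch under assumptions, not the authors' implementation: the names `XYNode`, `or_profile`, `runs`, `xy_cut`, and `leaves` are hypothetical, the image is assumed to be a list of 0/1 rows, and the profile xor-ing rule and the parallel pairwise tree merge from the paper are not reproduced.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


@dataclass
class XYNode:
    box: Box
    cut: str = "leaf"  # "rows", "cols", or "leaf"
    children: List["XYNode"] = field(default_factory=list)


def or_profile(img: List[List[int]], box: Box, axis: str) -> List[int]:
    """OR all pixels of each row (axis="rows") or each column (axis="cols")."""
    top, left, bottom, right = box
    if axis == "rows":
        return [int(any(img[r][left:right])) for r in range(top, bottom)]
    return [int(any(img[r][c] for r in range(top, bottom)))
            for c in range(left, right)]


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Half-open (start, end) intervals of consecutive non-zero entries."""
    out, start = [], None
    for i, v in enumerate(profile):
        if v and start is None:
            start = i
        elif not v and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(profile)))
    return out


def sub_box(box: Box, axis: str, s: int, e: int) -> Box:
    """Restrict box to the band [s, e) along the given axis."""
    top, left, bottom, right = box
    if axis == "rows":
        return (top + s, left, top + e, right)
    return (top, left + s, bottom, left + e)


def xy_cut(img: List[List[int]], box: Box, axis: str = "rows",
           tried_other: bool = False) -> XYNode:
    """Recursively split box at empty bands, alternating the cut direction."""
    spans = runs(or_profile(img, box, axis))
    other = "cols" if axis == "rows" else "rows"
    if not spans:  # block contains no ink at all
        return XYNode(box)
    if len(spans) == 1:  # no cut possible along this axis
        tight = sub_box(box, axis, *spans[0])
        if tried_other:
            return XYNode(tight)
        return xy_cut(img, tight, other, True)
    node = XYNode(box, cut=axis)
    for s, e in spans:
        node.children.append(xy_cut(img, sub_box(box, axis, s, e), other))
    return node


def leaves(node: XYNode) -> List[Box]:
    """Collect the leaf boxes of the tree in left-to-right order."""
    if not node.children:
        return [node.box]
    return [b for child in node.children for b in leaves(child)]
```

In the data-parallel setting of the abstract, a function like `xy_cut` would be run by each processor on its own block; how the resulting trees are then combined without the original image is the contribution of the paper and is deliberately left out here.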
Keywords :
divide and conquer methods; document image processing; image segmentation; parallel algorithms; tree data structures; PVM3 system; data parallel approach; divide and conquer implementation; document segmentation; parallel algorithm; xy-tree; xy-trees; Algorithm design and analysis; Application software; Concurrent computing; Image analysis; Image resolution; Image segmentation; Local area networks; Parallel algorithms; Text analysis; Workstations;
Conference_Title :
Parallel and Distributed Processing, 1995. Proceedings. Euromicro Workshop on
Conference_Location :
San Remo
Print_ISBN :
0-8186-7031-2
DOI :
10.1109/EMPDP.1995.389168