Title :
An algorithm for extracting cursive text lines
Author :
Bruzzone, Elisabetta ; Coffetti, Meri Cristina
Author_Institution :
R&D Dept., Elsag SpA, Genova, Italy
Abstract :
This paper describes a new algorithm for extracting text lines from a cursive image field. The proposed algorithm is a fast and satisfactorily accurate procedure for isolating text lines without loss of information. It partitions the input image into vertical strips, in order to deal with undulating or skewed text, and on that partition it analyzes horizontal run projections and groups and splits connected components. The goal is to prevent ascending and descending characters from being corrupted by arbitrary cuts. The algorithm has been designed for cursive text and can also be applied to handwritten text. It preserves punctuation so that word extraction performs better in a subsequent phase of handwritten line processing.
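The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of projecting ink onto the vertical axis strip by strip, not the authors' procedure. It assumes a binary image stored as a list of rows (1 = ink, 0 = background) and, as a simplification, counts ink pixels per row rather than horizontal runs. The names `strip_profiles`, `text_bands`, and `min_ink` are hypothetical, and the connected component grouping and splitting step is not covered.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumption: 1 = ink, 0 = background


def strip_profiles(image: Image, n_strips: int) -> List[List[int]]:
    """For each vertical strip, count the ink pixels in every row."""
    height = len(image)
    width = len(image[0]) if height else 0
    # Column boundaries of the strips (integer arithmetic, covers the full width).
    bounds = [i * width // n_strips for i in range(n_strips + 1)]
    return [
        [sum(row[bounds[s]:bounds[s + 1]]) for row in image]
        for s in range(n_strips)
    ]


def text_bands(profile: List[int], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return inclusive (top, bottom) row ranges where the profile contains ink."""
    bands: List[Tuple[int, int]] = []
    start = None
    for y, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = y                      # a band of text begins
        elif count < min_ink and start is not None:
            bands.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the bottom of the image
        bands.append((start, len(profile) - 1))
    return bands


# Toy skewed image: two text lines that sit two rows lower in the right half.
image: Image = []
for y in range(8):
    left = 1 if y in (0, 1, 4, 5) else 0
    right = 1 if y in (2, 3, 6, 7) else 0
    image.append([left] * 4 + [right] * 4)

# A single full-width projection sees ink in every row and cannot separate the lines.
global_bands = text_bands(strip_profiles(image, 1)[0])

# Two vertical strips recover both lines in each strip.
per_strip_bands = [text_bands(p) for p in strip_profiles(image, 2)]
```

On this toy input the full-width projection yields a single band covering all rows, while each of the two strips yields two separate bands, which illustrates why a strip partition helps with skewed or undulating text. Linking the bands of adjacent strips into complete lines, and handling ascenders and descenders that cross a band boundary, would need the component-level analysis that the abstract mentions but does not specify.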
Keywords :
document image processing; edge detection; handwritten character recognition; image segmentation; ascending characters; connected component grouping; connected component splitting; cursive image field; cursive text line extraction algorithm; descending characters; handwritten line processing; handwritten text; horizontal run projections; input image partitioning; isolating text lines; skewed text; undulating text; vertical strips; word extraction; Algorithm design and analysis; Character recognition; Data mining; Image recognition; Image segmentation; Law; Legal factors; Pixel; Read only memory; Research and development
Conference_Title :
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location :
Bangalore
Print_ISBN :
0-7695-0318-7
DOI :
10.1109/ICDAR.1999.791896