DocumentCode
3143176
Title
An algorithm for extracting cursive text lines
Author
Bruzzone, Elisabetta ; Coffetti, Meri Cristina
Author_Institution
R&D Dept., Elsag SpA, Genova, Italy
fYear
1999
fDate
20-22 Sep 1999
Firstpage
749
Lastpage
752
Abstract
This paper describes a new algorithm for extracting text lines from a cursive image field. The proposed algorithm is a fast and satisfactorily accurate procedure for isolating text lines without loss of information. To deal with undulating or skewed text, the input image is partitioned into vertical strips, and the algorithm analyzes horizontal run projections and groups and splits connected components on that partition. The goal of the algorithm is to prevent ascending and descending characters from being corrupted by arbitrary cuts. The algorithm has been designed for cursive text and can also be applied to handwritten text. It preserves punctuation so that word extraction performs better in a subsequent phase of handwritten line processing.
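The strip-wise projection idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (strip_projections, text_bands), the use of a plain foreground-pixel count in place of the paper's run projections, and the simple threshold are all assumptions made for illustration, and the connected component grouping and splitting step is omitted entirely.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed binary image: 1 = ink, 0 = background


def strip_projections(image: Image, n_strips: int) -> List[List[int]]:
    """Split the image into vertical strips and return, for each strip,
    the number of ink pixels on every row (a horizontal projection)."""
    if n_strips < 1:
        raise ValueError("n_strips must be at least 1")
    height = len(image)
    width = len(image[0]) if height else 0
    strip_w = max(1, -(-width // n_strips))  # ceiling division
    profiles = []
    for x0 in range(0, width, strip_w):
        x1 = min(x0 + strip_w, width)
        profiles.append([sum(row[x0:x1]) for row in image])
    return profiles


def text_bands(profile: List[int], threshold: int = 0) -> List[Tuple[int, int]]:
    """Return (top, bottom) row intervals where the projection exceeds
    the threshold; each interval is a candidate text-line piece."""
    bands = []
    start = None
    for y, value in enumerate(profile):
        if value > threshold and start is None:
            start = y  # a band of ink rows begins
        elif value <= threshold and start is not None:
            bands.append((start, y - 1))  # the band ended on the previous row
            start = None
    if start is not None:
        bands.append((start, len(profile) - 1))
    return bands


# A tiny skewed example: the right strip sits one row lower than the left.
image = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
profiles = strip_projections(image, 2)
bands_per_strip = [text_bands(p) for p in profiles]
```

Because each strip is analyzed separately, the bands in neighbouring strips may sit at different heights, which is how a strip partition can follow skewed or undulating lines. Linking bands across strips and resolving touching ascenders and descenders would need the grouping and splitting steps that this sketch leaves out.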
Keywords
document image processing; edge detection; handwritten character recognition; image segmentation; ascending characters; connected component grouping; connected component splitting; cursive image field; cursive text line extraction algorithm; descending characters; handwritten line processing; handwritten text; horizontal run projections; input image partitioning; isolating text lines; skewed text; undulating text; vertical strips; word extraction; Algorithm design and analysis; Character recognition; Data mining; Image recognition; Image segmentation; Law; Legal factors; Pixel; Read only memory; Research and development
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location
Bangalore
Print_ISBN
0-7695-0318-7
Type
conf
DOI
10.1109/ICDAR.1999.791896
Filename
791896