DocumentCode :
1634731
Title :
Text Line Segmentation Based on Morphology and Histogram Projection
Author :
dos Santos, Rodrigo P. ; Clemente, Gabriela S. ; Ren, Tsang Ing ; Cavalcanti, G.D.C.
Author_Institution :
Center of Informatics, Federal University of Pernambuco, Recife, Brazil
fYear :
2009
Firstpage :
651
Lastpage :
655
Abstract :
Text extraction is an important phase in document recognition systems. To segment text from a document page, it is necessary to detect all possible manuscript text regions. In this article we propose an efficient algorithm to segment handwritten text lines. The algorithm uses a morphological operator to obtain the features of the image. Next, a sequence of histogram projection and recovery steps is proposed to obtain the segmented line regions of the text. First, a Y histogram projection is performed, which yields the positions of the text lines. A threshold is applied to divide the lines into different regions. After that, another threshold is used to eliminate false lines. These procedures, however, cause some loss of text line area, so a recovery method is proposed to minimize this effect. To detect the extreme positions of the text in the horizontal direction, an X histogram projection is applied. Then, as in the Y direction, another threshold is used to eliminate false words. Finally, to optimize the area of the manuscript text line, a text selection is carried out. Experimental results on the IAM database show that this new approach is robust and fast and produces very good score rates.
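The projection and thresholding steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the threshold values (`ink_threshold`, `min_height`) and the toy binary image are assumptions made for illustration, and the morphological preprocessing, the recovery step and the final text selection are omitted.

```python
def row_projection(image):
    """Y histogram projection: number of ink pixels (value 1) in each row."""
    return [sum(row) for row in image]


def segment_lines(image, ink_threshold=1, min_height=2):
    """Return (top, bottom) row ranges (bottom exclusive) of candidate text lines.

    ink_threshold and min_height are illustrative values, not taken from the paper.
    """
    profile = row_projection(image)
    bands = []
    start = None
    for y, count in enumerate(profile):
        if count >= ink_threshold and start is None:
            start = y                      # a band of inked rows begins
        elif count < ink_threshold and start is not None:
            bands.append((start, y))       # the band ends at the first sparse row
            start = None
    if start is not None:
        bands.append((start, len(profile)))
    # Second threshold: discard bands too thin to be a real text line.
    return [(top, bottom) for top, bottom in bands if bottom - top >= min_height]


def column_extent(image, top, bottom, ink_threshold=1):
    """X histogram projection over one band: (left, right) with right exclusive."""
    width = len(image[0])
    columns = [sum(image[y][x] for y in range(top, bottom)) for x in range(width)]
    inked = [x for x, count in enumerate(columns) if count >= ink_threshold]
    if not inked:
        return None
    return inked[0], inked[-1] + 1


# Toy binary page: two text lines and one isolated noise row.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
]

lines = segment_lines(page)
boxes = [(top, bottom) + column_extent(page, top, bottom) for top, bottom in lines]
```

Here the single-row band at row 4 is rejected as a false line by the height threshold, and each remaining band is trimmed horizontally to the columns that contain ink.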
Keywords :
document image processing; feature extraction; handwritten character recognition; image segmentation; text analysis; document recognition system; handwritten text line segmentation; histogram projection; manuscript text; morphological operator; Cultural differences; Data mining; Histograms; Informatics; Morphological operations; Morphology; Robustness; Text recognition; Mathematical Morphology; Text Line Segmentation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
Conference_Location :
Barcelona
ISSN :
1520-5363
Print_ISBN :
978-1-4244-4500-4
Electronic_ISBN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2009.183
Filename :
5277563