Title :
Text Line Segmentation in Images of Handwritten Historical Documents
Author :
Sánchez, A. ; Suárez, P.D. ; Mello, C.A.B. ; Oliveira, A.L.I. ; Alves, V.M.O.
Author_Institution :
Dept. Cienc. de la Comput., Univ. Rey Juan Carlos, Madrid
Abstract :
This paper describes an original method for segmenting handwritten text lines in historical document images. After initial preprocessing, we compute a black/white transition map to obtain a rough detection of the line regions in the image. Using this map, the corresponding line axes are extracted with a skeletonization algorithm, and conflicts between adjacent cutting lines are resolved by a set of heuristics. Our approach was tested on a set of digitized handwritten documents (from the PROHIST Project database) dating from the end of the 19th century onwards. The proposed method worked well even on difficult images and correctly segmented 82.18% of the lines in our database. Compared with another recent proposal for automatic line extraction on the same test images, our method improved correct segmentation by more than 38%.
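The abstract does not define the transition map or the line-detection step in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a binarized page stored as rows of 0 (background) and 1 (ink), marks a transition wherever a pixel differs from its left neighbour, and treats runs of rows containing transitions as rough line regions. The names `transition_map`, `row_profile`, `line_bands` and the `min_transitions` threshold are hypothetical; the skeletonization and conflict-resolution heuristics are not covered.

```python
def transition_map(binary):
    """Mark each pixel whose value differs from its left neighbour (assumed definition)."""
    return [
        [0] + [int(row[x] != row[x - 1]) for x in range(1, len(row))]
        for row in binary
    ]


def row_profile(tmap):
    """Count the black/white transitions found in every image row."""
    return [sum(row) for row in tmap]


def line_bands(profile, min_transitions=1):
    """Group consecutive rows with enough transitions into (top, bottom) bands."""
    bands, start = [], None
    for y, count in enumerate(profile):
        if count >= min_transitions and start is None:
            start = y                      # a new band begins on this row
        elif count < min_transitions and start is not None:
            bands.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:                  # close a band that reaches the last row
        bands.append((start, len(profile) - 1))
    return bands


# Toy 6x6 "page" with two text lines separated by an empty row.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
bands = line_bands(row_profile(transition_map(page)))
print(bands)  # [(1, 2), (4, 4)]
```

On real handwriting, ascenders and descenders of neighbouring lines often touch or overlap, so a row-wise threshold like this would merge lines; that is presumably why the paper follows the rough detection with axis extraction and heuristics.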
Keywords :
document image processing; handwritten character recognition; humanities; image segmentation; text analysis; black transition map; handwritten historical document; rough detection; skeletonization algorithm; text line image segmentation; white transition map; Automatic testing; Gray-scale; Image databases; Image processing; Image resolution; Image segmentation; Ink; Internet; Proposals; Text recognition; document processing; handwriting; historical documents; line extraction; segmentation
Conference_Title :
Image Processing Theory, Tools and Applications, 2008. IPTA 2008. First Workshops on
Conference_Location :
Sousse
Print_ISBN :
978-1-4244-3321-6
Electronic_ISBN :
978-1-4244-3322-3
DOI :
10.1109/IPTA.2008.4743758