Title :
A handwriting textline extraction approach based on connected domain
Author :
Gao, Wei ; Sun, Fuchun ; Yin, Zhonghang
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Abstract :
This paper describes an approach for extracting words, textlines, and text blocks by analyzing the spatial configuration of connected domains and word contour rectangles in a given document image. The basic idea is that connected components of black pixels and their contours can be used as computational units in document image analysis. We look for a spatial feature and the overlap relationships of every contour rectangle, and we call this feature rectangle the “Standard Rectangle” (SR). We then calculate the split line of every textline through a series of operations on the SRs, and assign the word contour rectangles to different lines. Next, we determine whether adjacent textlines overlap. If they do, we calculate the overlap distance and move the word contour rectangles accordingly. Our experiments show that the approach works well on both overlapped and detached textlines.
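The abstract does not specify how the Standard Rectangle or the split lines are computed, so the following is only a minimal sketch of the general idea under stated assumptions: word contour rectangles are given as `(x, y, width, height)` tuples with `y` growing downward, they are grouped into textlines by a simple greedy test on their vertical centres, and the overlap distance between two adjacent lines is measured from their vertical extents. The names `group_into_lines` and `overlap_distance` are hypothetical, and the greedy grouping stands in for the paper's SR-based procedure rather than reproducing it.

```python
from typing import List, Tuple

# Assumed representation: (x, y, width, height), with y growing downward.
Rect = Tuple[int, int, int, int]


def group_into_lines(rects: List[Rect]) -> List[List[Rect]]:
    """Greedily group word rectangles into textlines.

    A rectangle joins the current line when its vertical centre falls
    inside that line's vertical extent; otherwise it starts a new line.
    This is an illustrative heuristic, not the paper's SR method.
    """
    lines: List[List[Rect]] = []
    for r in sorted(rects, key=lambda r: r[1] + r[3] / 2):
        centre_y = r[1] + r[3] / 2
        if lines:
            top = min(q[1] for q in lines[-1])
            bottom = max(q[1] + q[3] for q in lines[-1])
            if top <= centre_y <= bottom:
                lines[-1].append(r)
                continue
        lines.append([r])
    # Order the words of each line from left to right.
    return [sorted(line, key=lambda r: r[0]) for line in lines]


def overlap_distance(upper: List[Rect], lower: List[Rect]) -> int:
    """Vertical overlap in pixels between two adjacent lines (0 if detached)."""
    upper_bottom = max(y + h for _, y, _, h in upper)
    lower_top = min(y for _, y, _, _ in lower)
    return max(0, upper_bottom - lower_top)


words = [(0, 0, 10, 10), (12, 2, 10, 10), (0, 11, 10, 10), (12, 14, 10, 10)]
lines = group_into_lines(words)
gap = overlap_distance(lines[0], lines[1])
```

In this toy input the two lines overlap by one pixel, so a correction step in the spirit of the abstract could shift the rectangles of the lower line down by `gap` pixels before further processing.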
Keywords :
document image processing; edge detection; handwriting recognition; text analysis; document image analysis; handwriting textline extraction approach; word contour rectangles; Artificial neural networks; Layout; Noise; Pixel; Signal processing algorithms; Strontium; Surface morphology; connected domain; handwriting; textline extraction;
Conference_Title :
Cognitive Informatics (ICCI), 2010 9th IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-8041-8
DOI :
10.1109/COGINF.2010.5599738