DocumentCode
2540362
Title
A handwriting textline extraction approach based on connected domain
Author
Gao, Wei ; Sun, Fuchun ; Yin, Zhonghang
Author_Institution
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
fYear
2010
fDate
7-9 July 2010
Firstpage
217
Lastpage
222
Abstract
This paper describes an approach for extracting words, textlines, and text blocks by analyzing the spatial configuration of connected domains and word contour rectangles in a given document image. The basic idea is that connected components of black pixels and contours can be used as computational units in document image analysis. We derive a spatial feature and the overlap relationships for every contour rectangle, and we call the resulting feature rectangle the “Standard Rectangle” (SR). We then calculate the split line of every textline through a series of operations on the SRs and assign the word contour rectangles to different lines. Next, we determine whether adjacent textlines overlap. If they do, we calculate the overlap distance and move the word contour rectangles accordingly. Our experiments show that the approach performs well on both overlapped and detached textlines.
Keywords
document image processing; edge detection; handwriting recognition; text analysis; document image analysis; handwriting textline extraction approach; word contour rectangles; Artificial neural networks; Layout; Noise; Pixel; Signal processing algorithms; Strontium; Surface morphology; connected domain; handwriting; textline extraction;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cognitive Informatics (ICCI), 2010 9th IEEE International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4244-8041-8
Type
conf
DOI
10.1109/COGINF.2010.5599738
Filename
5599738