Title :
Text line extraction from handwritten document pages based on line contour estimation
Author :
Sarkar, Rituparna ; Halder, Sebastian ; Malakar, Samir ; Das, Niladri ; Basu, Sreetama ; Nasipuri, Mita
Author_Institution :
Dept. of Comput. Sci. & Eng., Jadavpur Univ., Kolkata, India
Abstract :
Extracting text lines from handwritten or printed document images is an important step in an Optical Character Recognition (OCR) system. In handwritten document images, the presence of skewed, touching, or overlapping text lines makes this step particularly challenging. The present work reports a new text line extraction technique based on line contour estimation. The digitized document image is first partitioned into a number of vertical fragments of equal width. All the line segments present in these vertical fragments are then detected. Finally, neighboring line segments are analyzed to place each of them inside the boundary of the text line to which it actually belongs. In experiments on the CMATERdb1.2.1 database, the technique successfully extracts 88.44% of the text lines.
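The first two steps named in the abstract (equal-width vertical partitioning, then line-segment detection inside each fragment) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how segments are detected, so the sketch assumes a binarized page (1 = ink, 0 = background) and treats each run of consecutive inked rows inside a fragment as one line segment. The names `split_into_strips`, `segments_in_strip`, and `detect_segments` are hypothetical, and the final step (assigning neighboring segments to line boundaries) is left out because the abstract gives no detail on it.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, int]  # (top_row, bottom_row), both inclusive


def split_into_strips(width: int, n_strips: int) -> List[Tuple[int, int]]:
    """Return (start_col, end_col) pairs, end exclusive, of near-equal width."""
    base, extra = divmod(width, n_strips)
    bounds, start = [], 0
    for i in range(n_strips):
        # Spread any remainder columns over the first `extra` strips.
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def segments_in_strip(image: Sequence[Sequence[int]],
                      start_col: int, end_col: int) -> List[Segment]:
    """Assumed detection rule: a segment is a maximal run of rows with ink."""
    segments: List[Segment] = []
    top = None
    for r, row in enumerate(image):
        has_ink = any(row[start_col:end_col])
        if has_ink and top is None:
            top = r                          # a new segment starts here
        elif not has_ink and top is not None:
            segments.append((top, r - 1))    # segment ended on previous row
            top = None
    if top is not None:                      # segment touches the bottom edge
        segments.append((top, len(image) - 1))
    return segments


def detect_segments(image: Sequence[Sequence[int]],
                    n_strips: int) -> List[List[Segment]]:
    """Per-strip list of line segments for a binarized page."""
    width = len(image[0])
    return [segments_in_strip(image, s, e)
            for s, e in split_into_strips(width, n_strips)]


# Toy page: the upper text line is skewed, so it occupies different rows
# in the left and right fragments; the lower line is horizontal.
page = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
result = detect_segments(page, n_strips=2)
```

On the toy page, the skewed upper line appears as rows 0 to 1 in the left fragment and rows 2 to 3 in the right one. A global horizontal projection would not show a clean gap between such lines, which is the motivation for working on narrow vertical fragments.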
Keywords :
document image processing; feature extraction; optical character recognition; visual databases; CMATERdb1.2.1 database; OCR system; handwritten document page; line contour estimation; line segment; printed document image; text line extraction; vertical fragment; Image segmentation; Integrated optics; Optical imaging; Random access memory; CMATERdb; Contour estimation; Handwritten document pages; Multi-skewed text line; OCR; Text line extraction; Vertical partitioning
Conference_Title :
Computing Communication & Networking Technologies (ICCCNT), 2012 Third International Conference on
Conference_Location :
Coimbatore
DOI :
10.1109/ICCCNT.2012.6395873