DocumentCode :
3108219
Title :
Text line extraction from handwritten document pages using spiral run length smearing algorithm
Author :
Malakar, Samir ; Halder, Sebastian ; Sarkar, Rituparna ; Das, Niladri ; Basu, Sreetama ; Nasipuri, Mita
Author_Institution :
Dept. of Master of Comput. Applic., MCKV Inst. of Eng., Liluah, India
fYear :
2012
fDate :
28-29 Dec. 2012
Firstpage :
616
Lastpage :
619
Abstract :
Extracting text lines from document images is an important step in an Optical Character Recognition (OCR) system. In handwritten document images, skewed, touching, or overlapping text lines make this step particularly challenging. This work reports a new text line extraction technique based on the Spiral Run Length Smearing Algorithm (SRLSA). First, the digitized document image is partitioned into a number of vertical fragments of equal width. Then, all text line segments present in these fragments are identified by applying SRLSA. Finally, neighboring text line segments are analyzed and, if necessary, merged so that they fall inside the boundary of the text line to which they actually belong. The technique is tested on the CMATERdb1.1.1 and CMATERdb1.2.1 databases, from which it successfully extracts 87.09% and 89.35% of the text lines, respectively.
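The first two steps of the abstract (vertical partitioning, then smearing inside each fragment) can be illustrated with a minimal sketch. It uses the classic horizontal Run Length Smearing Algorithm, not the spiral variant (SRLSA), whose details the abstract does not give. The function names, the `threshold` parameter, the binary-image encoding (1 = ink, 0 = background), and the narrower last fragment when the width does not divide evenly are all assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = ink, 0 = background


def fragment_bounds(width: int, fragment_width: int) -> List[Tuple[int, int]]:
    """Column ranges [start, end) of the vertical fragments.

    Assumption: the last fragment may be narrower than the others.
    """
    return [(s, min(s + fragment_width, width))
            for s in range(0, width, fragment_width)]


def smear_row(row: List[int], threshold: int) -> List[int]:
    """Classic horizontal RLSA on one row: fill every background run of
    length <= threshold that lies between two ink pixels."""
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen
    for i, px in enumerate(row):
        if px == 1:
            if last_ink is not None and 0 < i - last_ink - 1 <= threshold:
                for j in range(last_ink + 1, i):
                    out[j] = 1
            last_ink = i
    return out


def smear_fragments(image: Image, fragment_width: int, threshold: int) -> Image:
    """Apply smear_row independently inside each vertical fragment, so that
    smearing never bridges a fragment boundary."""
    if not image:
        return []
    bounds = fragment_bounds(len(image[0]), fragment_width)
    result = []
    for row in image:
        new_row: List[int] = []
        for start, end in bounds:
            new_row.extend(smear_row(row[start:end], threshold))
        result.append(new_row)
    return result
```

Smearing each fragment separately keeps the filled runs short, which is one plausible reason for partitioning a skewed page before smearing; merging the resulting segments across neighboring fragments is not sketched here.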
Keywords :
feature extraction; visual databases; CMATERdb1.1.1; CMATERdb1.2.1; digitized document image; handwritten document images; handwritten document pages; optical character recognition system; spiral run length smearing algorithm; text line extraction; text line segments; vertical fragments; Decision support systems; Intelligent systems; CMATERdb; Handwritten document pages; OCR; SRLSA; Text line extraction; Vertical partitioning;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Communications, Devices and Intelligent Systems (CODIS), 2012 International Conference on
Conference_Location :
Kolkata
Print_ISBN :
978-1-4673-4699-3
Type :
conf
DOI :
10.1109/CODIS.2012.6422278
Filename :
6422278