DocumentCode :
627282
Title :
Segmentation of handwritten Bangla script
Author :
Rahman, Aminur ; Cyrus, Hossain Md ; Yasir, Farhad ; Adnan, Waliul Bari ; Islam, Md Minarul
Author_Institution :
Dept. of Comput. Sci. & Eng., Bangladesh Univ. of Eng. & Technol., Dhaka, Bangladesh
fYear :
2013
fDate :
17-18 May 2013
Firstpage :
1
Lastpage :
5
Abstract :
Segmentation of handwritten Bangla script is one of the most critical parts of an optical character recognition (OCR) system. Taking into account the varied writing styles of different individuals, we propose an efficient scheme to segment unconstrained handwritten Bangla script into lines, words, and characters. First, for line segmentation, we divide the whole script into column segments, which are determined from the mode of the widths of the black pixel regions. In each column segment, we mark potential line markers by considering the heights of the black pixel regions. We compute a set of potential line markers for each segment and join them using the Construct Line Algorithm, which segments the text lines. Lines are then segmented into words by considering the widths of the black pixel regions and computing the distance between two consecutive black pixel regions. In a handwritten word, determining the Matra is necessary to segment the word into characters. We enclose the word in its minimum bounding box and consider those black pixels at which the vertical flow of white pixels is blocked. The mode of the vertical positions of these black pixels is determined to find the Matra zone, where the characters are connected with one another. The connections between two characters are then identified by considering pixel density, and the word is divided into characters at these connections.
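The Matra-detection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name estimate_matra_row, the 0/1 list-of-rows image representation, and the reading of "where the vertical flow of white pixels is blocked" as "the first black pixel met when scanning each column from the top" are all assumptions made for illustration.

```python
from collections import Counter


def estimate_matra_row(word):
    """Estimate the Matra row of a binarized word image.

    `word` is a list of rows, each a list of 0 (white) or 1 (black),
    assumed to be already cropped to the word's minimum bounding box.
    Returns the most frequent row index at which a top-down column scan
    first hits a black pixel, or None if the image has no black pixel.
    (Hypothetical helper; the paper's exact rule is not given in the abstract.)
    """
    if not word:
        return None
    height, width = len(word), len(word[0])

    first_black_rows = []
    for x in range(width):
        # Scan the column downward until the white run is blocked.
        for y in range(height):
            if word[y][x]:
                first_black_rows.append(y)
                break

    if not first_black_rows:
        return None
    # The mode of the blocking positions approximates the Matra row.
    return Counter(first_black_rows).most_common(1)[0][0]


if __name__ == "__main__":
    sample = [
        [0, 0, 1, 0, 0, 0, 0],  # a stroke rising above the headline
        [1, 1, 1, 1, 1, 0, 1],  # the headline (Matra)
        [0, 1, 0, 0, 1, 0, 1],
        [0, 1, 0, 0, 1, 0, 0],
    ]
    print(estimate_matra_row(sample))
```

In this sketch a single row index stands in for the Matra zone; a zone of several rows could be taken around that index, but the abstract does not specify how wide it should be.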
Keywords :
handwritten character recognition; image segmentation; natural language processing; optical character recognition; text analysis; word processing; Matra zone; black pixel region height; black pixel region width; black pixel vertical position mode; column segmentation; construct line algorithm method; handwritten word segmentation; line markers; minimum bounding box; optical character recognition system; pixel density; text line segmentation; unconstrained handwritten Bangla script segment; white pixel block vertical flow; writing style; Accuracy; Approximation methods; Character recognition; Handwriting recognition; Image segmentation; Noise;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Informatics, Electronics & Vision (ICIEV), 2013 International Conference on
Conference_Location :
Dhaka
Print_ISBN :
978-1-4799-0397-9
Type :
conf
DOI :
10.1109/ICIEV.2013.6572635
Filename :
6572635