DocumentCode
627282
Title
Segmentation of handwritten Bangla script
Author
Rahman, Aminur ; Cyrus, Hossain Md ; Yasir, Farhad ; Adnan, Waliul Bari ; Islam, Md Minarul
Author_Institution
Dept. of Comput. Sci. & Eng., Bangladesh Univ. of Eng. & Technol., Dhaka, Bangladesh
fYear
2013
fDate
17-18 May 2013
Firstpage
1
Lastpage
5
Abstract
Segmentation of handwritten Bangla script is one of the most critical areas of an Optical Character Recognition (OCR) system. Taking into account the varied writing styles of different individuals, we propose an efficient scheme to segment unconstrained handwritten Bangla script into lines, words and characters. First, for line segmentation, we divide the whole script into column segments. These segments are computed from the mode of the widths of the black pixel regions. In each column segment, we mark potential line markers by considering the height of the black pixel regions. We compute a set of potential line markers for each segment and join them using the Construct Line Algorithm, which yields the segmented text lines. Lines are then segmented into words by considering the width of the black pixel regions and computing the distance between two consecutive black pixel regions. In a handwritten word, determining the Matra is necessary to segment the word into characters. We enclose the word in its minimum bounding box and consider those black pixels at which the vertical flow of white pixels is blocked. The mode of the vertical positions of these black pixels is determined to find the Matra zone, where the characters are connected with one another. Considering pixel density, the connections between two characters are then determined in order to divide the words into characters.
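The word-segmentation and Matra-detection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the image representation (a list of rows, with 1 for a black pixel and 0 for white), the function names, and the `gap_threshold` parameter are all assumptions made for illustration, and the abstract does not say how the gap threshold is chosen. The column-segment and Construct Line Algorithm steps are not sketched because the abstract gives too little detail about them.

```python
from collections import Counter


def black_column_runs(image):
    """Return (start, end) column ranges, end exclusive, that contain black pixels."""
    width = len(image[0]) if image else 0
    has_black = [any(row[x] for row in image) for x in range(width)]
    runs, start = [], None
    for x, black in enumerate(has_black):
        if black and start is None:
            start = x                      # a black region begins
        elif not black and start is not None:
            runs.append((start, x))        # the black region ends
            start = None
    if start is not None:
        runs.append((start, width))
    return runs


def segment_words(line_image, gap_threshold):
    """Merge black regions into words; a gap >= gap_threshold starts a new word.

    gap_threshold is a hypothetical parameter, not taken from the paper.
    """
    words = []
    for start, end in black_column_runs(line_image):
        if words and start - words[-1][1] < gap_threshold:
            words[-1] = (words[-1][0], end)  # small gap: same word
        else:
            words.append((start, end))       # large gap: new word
    return words


def estimate_matra_row(word_image):
    """Mode of the row where a top-down scan of each column first meets black.

    This is one reading of "black pixels where the vertical flow of white
    pixels is blocked"; returns None if the image has no black pixel.
    """
    width = len(word_image[0]) if word_image else 0
    tops = []
    for x in range(width):
        for y, row in enumerate(word_image):
            if row[x]:
                tops.append(y)
                break
    if not tops:
        return None
    return Counter(tops).most_common(1)[0][0]
```

Calling `segment_words(line_image, gap_threshold)` on a binarized text line returns the column span of each word, and `estimate_matra_row(word_image)` on a cropped word returns the row most likely to hold the Matra. Using the mode rather than the mean keeps the estimate stable when a few strokes rise above the headline.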
Keywords
handwritten character recognition; image segmentation; natural language processing; optical character recognition; text analysis; word processing; Matra zone; black pixel region height; black pixel region width; black pixel vertical position mode; column segmentation; construct line algorithm method; handwritten word segmentation; line markers; minimum bounding box; optical character recognition system; pixel density; text line segmentation; unconstrained handwritten Bangla script segment; white pixel block vertical flow; writing style; Accuracy; Approximation methods; Character recognition; Handwriting recognition; Image segmentation; Noise;
fLanguage
English
Publisher
IEEE
Conference_Titel
Informatics, Electronics & Vision (ICIEV), 2013 International Conference on
Conference_Location
Dhaka
Print_ISBN
978-1-4799-0397-9
Type
conf
DOI
10.1109/ICIEV.2013.6572635
Filename
6572635