DocumentCode :
3394637
Title :
Trimming approach for word segmentation with focus on overlapping characters
Author :
Gomathi, S. ; Devi, R. S. Uma ; Mohanavel, S.
Author_Institution :
Dept. of Comput. Sci., Sri Ramakrishna Eng. Coll., Coimbatore, India
fYear :
2013
fDate :
4-6 Jan. 2013
Firstpage :
1
Lastpage :
4
Abstract :
Document image analysis methods fail on freestyle handwritten documents, in which text lines are curvilinear and the gaps between words are nonuniform. This paper introduces a relatively simple method that is more tolerant of such cases. The proposed word segmentation requires the document to be already segmented into text lines. The system begins by pre-processing the scanned image of the handwritten text, enhancing some features and eliminating some inconsistencies to increase recognition accuracy. It addresses the sensitivity of the spatial measure and threshold to the shape of the connected component (CC) by reducing the region of interest to the core region. This rectifies a problem of segmenting words from lines with the bounding box (BB) method, which suppresses the structure of the characters. The trimmed mean (TM) is used both to detect the core region and as the threshold for gap discrimination. The system was developed in Java and its performance was evaluated on word images selected from the IAM database. Applied to 1100 text lines, the segmentation scheme achieved 96.7% accuracy, whereas the BB method achieved only 90.1%.
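The abstract names the trimmed mean as the threshold for gap discrimination but does not give the trimming fraction or the distance measure. The following is only a minimal sketch of that idea, not the authors' implementation. It assumes the gaps are horizontal distances between consecutive connected components of one text line, and that a gap wider than the trimmed mean separates two words. The class name `TrimmedMeanGaps`, the methods `trimmedMean` and `wordBreaks`, and the parameter `trimFraction` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TrimmedMeanGaps {

    // Mean of the values left after dropping the smallest and largest
    // floor(n * trimFraction) entries. The fraction is an assumption;
    // the abstract does not state which one the paper uses.
    static double trimmedMean(double[] values, double trimFraction) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int k = (int) Math.floor(sorted.length * trimFraction);
        // Always keep at least one value in the middle.
        if (2 * k >= sorted.length) {
            k = (sorted.length - 1) / 2;
        }
        double sum = 0.0;
        for (int i = k; i < sorted.length - k; i++) {
            sum += sorted[i];
        }
        return sum / (sorted.length - 2 * k);
    }

    // Indices of gaps that exceed the trimmed-mean threshold; under the
    // assumption above, these are treated as word boundaries.
    static List<Integer> wordBreaks(double[] gaps, double trimFraction) {
        double threshold = trimmedMean(gaps, trimFraction);
        List<Integer> breaks = new ArrayList<>();
        for (int i = 0; i < gaps.length; i++) {
            if (gaps[i] > threshold) {
                breaks.add(i);
            }
        }
        return breaks;
    }

    public static void main(String[] args) {
        // Hypothetical gap widths in pixels, not data from the paper.
        double[] gaps = {2, 3, 2, 14, 3, 2, 15, 3, 2, 40};
        System.out.println("threshold = " + trimmedMean(gaps, 0.1));
        System.out.println("word breaks after components " + wordBreaks(gaps, 0.1));
    }
}
```

Trimming both tails keeps a single very wide gap (here 40) from inflating the threshold, which a plain mean would allow. The sketch covers gap discrimination only; core region detection, which the abstract also bases on the trimmed mean, is not shown.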
Keywords :
Java; document image processing; handwriting recognition; image segmentation; text detection; BB method; CC; IAM database; TM; bounding box method; connected component; core region detection; document image analysis methods; freestyle handwritten documents; gap discrimination; overlapping characters; region of interest; text lines; trimmed mean; trimming approach; word segmentation; Databases; Measurement; Text analysis; Text recognition; Distance Computation
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Communication and Informatics (ICCCI), 2013 International Conference on
Conference_Location :
Coimbatore
Print_ISBN :
978-1-4673-2906-4
Type :
conf
DOI :
10.1109/ICCCI.2013.6466272
Filename :
6466272