DocumentCode :
2197080
Title :
A Novel Arabic Baseline Estimation Algorithm Based on Sub-Words Treatment
Author :
Boukerma, Hanene ; Farah, Nadir
Author_Institution :
Lab. de Gestion Electron. du Document (LABGED), Univ. 20 Aout 1955, Skikda, Algeria
fYear :
2010
fDate :
16-18 Nov. 2010
Firstpage :
335
Lastpage :
338
Abstract :
Baseline detection is an essential preprocessing step for many OCR systems. It directly affects the efficiency and reliability of the character segmentation and feature extraction stages, which in turn contribute strongly to recognition accuracy. For Arabic handwriting, conventional methods that extract the baseline as a single straight line are ill-suited, because an Arabic word may be composed of two or more sub-words (pieces of Arabic words, PAWs), and the arrangement of these sub-words can produce different slant angles within the same word. Focusing on the source of the problem, we propose a novel Arabic baseline estimation algorithm in which the PAW, rather than the word, is the basic unit to be processed. Experimental results on the IFN/ENIT database [1] demonstrate the efficiency of the proposed algorithm.
Keywords :
edge detection; feature extraction; handwritten character recognition; image segmentation; natural languages; optical character recognition; word processing; Arabic handwritten character recognition; OCR system; PAW level; Arabic baseline estimation algorithm; baseline detection; character segmentation reliability; subword treatment; Arabic handwritten; preprocessing; sub-word extraction
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Frontiers in Handwriting Recognition (ICFHR), 2010 International Conference on
Conference_Location :
Kolkata
Print_ISBN :
978-1-4244-8353-2
Type :
conf
DOI :
10.1109/ICFHR.2010.58
Filename :
5693545