DocumentCode :
1796082
Title :
Prior segmentation of old Arabic manuscripts by separator word spotting
Author :
Aouadi, Nabil ; Echi, Afef Kacem
Author_Institution :
La TICE-Ensit, Univ. of Tunis, Tunis, Tunisia
fYear :
2014
fDate :
11-14 Aug. 2014
Firstpage :
31
Lastpage :
36
Abstract :
Because of the low quality of old manuscripts, the complexity of Arabic script and the different writing styles, segmenting them is a challenging problem. This work aims to preprocess these manuscripts so that they can be correctly segmented into independent words for text recognition. The idea is to spot separator words, detach them from neighboring words if necessary and use them to segment text-lines into words. To locate separator words in these document images, we propose a word spotting method based on the Generalized Hough Transform. This method is performed using convex theory points. Within a window centered on the group of votes for the separator word, it detects all connections below the text-line baseline, analyzes terminal letter morphology and tries to separate touching or overlapping components. We tested the proposed system on Arabic historical manuscripts from the 19th century onwards conserved in the Tunisian National Archives. Experiments show very encouraging results.
Keywords :
Hough transforms; document image processing; history; image segmentation; text detection; Arabic historical manuscripts; Arabic script complexity; Tunisian National Archives; convex theory points; document images; generalized Hough transform; independent words; manuscript preprocessing; overlapping components; prior old Arabic manuscript segmentation; separator word location; separator word spotting; separator word votes; terminal letter morphology analysis; text recognition; text-line baseline; text-line segmentation; touching components; writing styles; Dictionaries; Image segmentation; Junctions; Morphology; Particle separators; Transforms; Baseline; Convex Point Theory; Hough Generalized Transform; Segmentation; Skeleton; Word Spotting; angular variation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Soft Computing and Pattern Recognition (SoCPaR), 2014 6th International Conference of
Conference_Location :
Tunis
Type :
conf
DOI :
10.1109/SOCPAR.2014.7007977
Filename :
7007977