DocumentCode :
2323292
Title :
A novel segmentation technique for splitting a typed Persian text to sub-words
Author :
Shafii, Mahnaz ; Sid-Ahmed, Maher A. ; Ahmadi, Majid
Author_Institution :
Univ. of Windsor, Windsor, ON, Canada
fYear :
2012
fDate :
2-4 May 2012
Firstpage :
1
Lastpage :
5
Abstract :
One common approach to recognizing Persian text is to segment the text into its component words and then segment the words into single characters. Due to specific characteristics of Persian words, this approach is non-trivial, and the literature reports relatively low success rates for Persian character segmentation. As an alternative, we segment a Persian text only into its component sub-words and then recognize each sub-word from a large library of sub-words. In this paper, we describe a novel segmentation technique that splits a Persian text into its sub-word components with a perfect success rate for the texts and fonts tested.
Keywords :
character recognition; image recognition; image segmentation; Persian character segmentation; segmentation technique; sub-words; text recognition; typed Persian text; Algorithm design and analysis; Character recognition; Image segmentation; Labeling; Optical character recognition software; Sorting; Text recognition; OCR; Persian Text Recognition; Sub-words Segmentation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Communications Control and Signal Processing (ISCCSP), 2012 5th International Symposium on
Conference_Location :
Rome
Print_ISBN :
978-1-4673-0274-6
Type :
conf
DOI :
10.1109/ISCCSP.2012.6217760
Filename :
6217760