Title :
A new method to separation of Farsi and Arabic sub-words using image processing techniques
Author :
Shirvani, P. ; Khouzani, M.V.
Author_Institution :
Dept. of Electr. Eng., Univ. of Semnan, Semnan, Iran
Abstract :
Separating letters and word units is one of the most important parts of text recognition algorithms. In the Farsi language, these units consist of single letters and connected letters, which are called “sub-words”. Separating these units therefore plays a major role in developing text processing algorithms. In this paper, a high-accuracy method based on connected-component labeling techniques is suggested that makes letter and sub-word separation possible for Farsi fonts of any size. Experiments show more than 90 percent accuracy for this method.
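The abstract names connected-component labeling as the basis of the method but gives no implementation details. As a minimal sketch of that general technique only, and not the authors' algorithm, the following labels 8-connected foreground pixels in a binarized image and orders the resulting components from right to left, the reading direction of Farsi and Arabic. The function names `label_components` and `order_right_to_left`, the list-of-lists image format (1 for ink, 0 for background), and the choice of 8-connectivity are all assumptions made for illustration.

```python
from collections import deque


def label_components(image):
    """Label 8-connected foreground pixels (value 1) in a binary image.

    Returns (labels, boxes): labels is a grid of component ids (0 means
    background) and boxes maps each id to (top, left, bottom, right).
    Illustrative sketch only; not the method from the paper.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    boxes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or labels[r][c] != 0:
                continue  # background, or already assigned to a component
            next_label += 1
            labels[r][c] = next_label
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            boxes[next_label] = (top, left, bottom, right)
    return labels, boxes


def order_right_to_left(boxes):
    """Return component ids sorted by right edge, rightmost first."""
    return sorted(boxes, key=lambda k: boxes[k][3], reverse=True)
```

In this sketch every disconnected blob, including a dot or other diacritic, becomes its own component; attaching such marks to the body of a sub-word would need an extra grouping step that the abstract does not describe.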
Keywords :
image processing; natural language processing; text detection; Arabic sub-words; Farsi language; Farsi sub-words; connected-component labeling techniques; image processing techniques; letter separation; sub-words separation; text processing algorithms; text recognition algorithms; Accuracy; Conferences; Image recognition; Labeling; Signal processing algorithms; Text recognition;
Conference_Title :
Pattern Recognition and Image Analysis (PRIA), 2013 First Iranian Conference on
Conference_Location :
Birjand
Print_ISBN :
978-1-4673-6204-7
DOI :
10.1109/PRIA.2013.6528457