DocumentCode :
3062058
Title :
Detecting and recognizing numerical strings in Farsi document images
Author :
Abedi, Ali ; Faez, Karim ; Mozaffari, Saeed
Author_Institution :
Electr. Eng. Dept., Amirkabir Univ. of Technol., Tehran, Iran
fYear :
2009
fDate :
23-25 Nov. 2009
Firstpage :
403
Lastpage :
408
Abstract :
In this paper, we propose a new approach for detecting and recognizing numerical strings in Farsi/Arabic handwritten or machine-printed document images. We label each connected component according to whether or not it belongs to a numerical string. First, to differentiate between digit and non-digit connected components, some simple features are extracted from all connected components in each text line. Then, these features are classified with a fuzzy rule-based classifier to extract candidate strings. After a digit recognizer is applied, the syntax of the numerical strings is validated by a syntactic verifier. Experimental results show an acceptable detection rate with a low false positive rate.
Keywords :
document image processing; feature extraction; fuzzy set theory; image classification; object detection; string matching; Farsi document images; Farsi-Arabic handwritten; digit recognizer; feature extraction; fuzzy rule-based classifier; machine-printed document images; numerical string detecting; numerical string recognition; Character recognition; Computer vision; Costs; Data mining; Feature extraction; Handwriting recognition; Image converters; Image recognition; Optical character recognition software; Text analysis; Farsi/Arabic document analysis; Feature extraction; Information extraction; Numerical Strings;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Image and Vision Computing New Zealand, 2009. IVCNZ '09. 24th International Conference
Conference_Location :
Wellington
ISSN :
2151-2205
Print_ISBN :
978-1-4244-4697-1
Electronic_ISBN :
2151-2205
Type :
conf
DOI :
10.1109/IVCNZ.2009.5378373
Filename :
5378373