Title :
Recognition of middle age Persian characters using a set of invariant moments
Author :
Alirezaee, S. ; Aghaeinia, H. ; Ahmadi, M. ; Faez, K.
Author_Institution :
Dept. of Electr. Eng., Amirkabir Univ. of Technol., Tehran, Iran
Abstract :
This paper studies the recognition of ancient middle Persian documents, with the main focus on feature extraction and classification. A set of invariant moments is selected as the features, and the classifiers are minimum mean distance (in three versions, called MMD1, MMD2, and MMD3), KNN, and Parzen. Preprocessing is also considered, so that the effects of undersampling (resolution pyramids), smoothing, and thinning can be investigated. The algorithm has been tested not only on the original and smoothed images but also on skeletonized and undersampled versions of the text under test. The results show an acceptable recognition rate for middle-age Persian characters with the selected features and the proposed processing. The best classification rates achieved are 95% and 90.5% for smoothed and original character images, respectively. Notably, the KNN and MMD2 classifiers yielded better recognition rates than the others.
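The feature-plus-classifier pipeline described in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: the invariant moments are taken to be Hu's moments (only the first two are computed here), and the minimum mean distance classifier is shown as a plain Euclidean nearest-class-mean rule, which is not claimed to match MMD1, MMD2, or MMD3. The function names `hu_moments`, `class_means`, and `classify` are hypothetical, and the toy shapes stand in for character images.

```python
import math

def hu_moments(img):
    """First two Hu invariant moments of a binary image (list of rows).

    Assumption: the abstract does not say which invariant moments were used.
    """
    # Coordinates of all foreground (non-zero) pixels.
    pts = [(x, y) for y, row in enumerate(img) for x, v in enumerate(row) if v]
    m00 = len(pts)
    if m00 == 0:
        raise ValueError("image has no foreground pixels")
    # Centroid, used to make the moments translation invariant.
    cx = sum(x for x, _ in pts) / m00
    cy = sum(y for _, y in pts) / m00

    def eta(p, q):
        # Central moment, normalized by area for scale invariance.
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pts)
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = n20 + n02
    phi2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    return [phi1, phi2]

def class_means(samples):
    """Map each label to the mean of its training feature vectors."""
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in samples.items()
    }

def classify(features, means):
    """Assign the label whose class mean is nearest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(features, means[label]))

# Toy stand-ins for character images (hypothetical data, not from the paper).
bar = [[1, 1, 1, 1]]
square = [[1, 1], [1, 1]]
# The same bar, rotated by 90 degrees and shifted inside a larger frame.
rotated_bar = [[0, 0], [0, 1], [0, 1], [0, 1], [0, 1]]

means = class_means({"bar": [hu_moments(bar)], "square": [hu_moments(square)]})
label = classify(hu_moments(rotated_bar), means)
```

In this sketch the rotated and shifted bar produces the same two moment values as the original bar, so the nearest-mean rule assigns it to the `bar` class. A KNN or Parzen classifier would replace only the `classify` step and reuse the same feature vectors.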
Keywords :
character recognition; document image processing; feature extraction; image resolution; natural languages; neural nets; sampling methods; KNN; Parzen; ancient middle Persian documents; character images; feature classification; feature extraction; invariant moment set; middle age Persian character recognition; minimum mean distance; resolution pyramids; under sampling; Books; Character recognition; Entropy; Feature extraction; Image sampling; Optical character recognition software; Shape; Smoothing methods; Testing; Text recognition;
Conference_Title :
Information Theory, 2004. ISIT 2004. Proceedings. International Symposium on
Print_ISBN :
0-7695-2250-5
DOI :
10.1109/AIPR.2004.39