DocumentCode :
3720773
Title :
Language independent rule based classification of printed & handwritten text
Author :
Tanzila Saba;Abdulaziz S. Almazyad;Amjad Rehman
Author_Institution :
College of Computer and Information Sciences, Prince Sultan University, Riyadh, KSA
fYear :
2015
Firstpage :
1
Lastpage :
4
Abstract :
Handwriting in data entry forms and documents usually represents information filled in by the user, which should be treated differently from the printed text. In the Arab world, this filled-in information is normally in English or Arabic. Moreover, classification approaches differ considerably between machine-printed text and handwritten script. Therefore, prior to segmentation and classification, separating text into printed and handwritten entries is mandatory. This research addresses the problem of language-independent text distinction in multilingual data entry forms. Our main focus is to distinguish machine-printed text from handwritten script in multilingual data entry forms, independent of language. The proposed approach explores new statistical and structural features of text lines to classify them into separate categories. Accordingly, a set of classification rules is derived to explicitly differentiate machine-printed and handwritten entries written in any language. An additional novelty of the proposed approach is that no training or training data is required; rather, text is discriminated on the basis of simple rules. Promising experimental results with 90% accuracy show that the proposed approach is simple and robust. Finally, the scheme is independent of the language, style, size, and fonts that commonly co-exist in multilingual data entry forms.
Keywords :
"Artificial intelligence","Robustness","Artificial neural networks","Character recognition","Image recognition"
Publisher :
IEEE
Conference_Titel :
Evolving and Adaptive Intelligent Systems (EAIS), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/EAIS.2015.7368806
Filename :
7368806