DocumentCode :
2038751
Title :
Bilingual OCR system for printed documents in Malayalam and English
Author :
Rahiman, M.A. ; Adheena, C.V. ; Anitha, R. ; Deepa, N. ; Kumar, G. Manoj ; Rajasree, M.S.
Author_Institution :
Karpagam Univ., Coimbatore, India
Volume :
3
fYear :
2011
fDate :
8-10 April 2011
Firstpage :
40
Lastpage :
45
Abstract :
India is a multilingual, multi-script country in which a single line of a bilingual document page may contain words in both a regional language and English. Recognizing multi-script documents is a challenging task that demands additional effort from OCR designers to improve accuracy. This paper presents a bilingual OCR system for printed Malayalam and English text. The proposed algorithm accepts a scanned image of printed characters as input and produces editable Malayalam and English characters in a predefined format as output. The acquired image is segmented into lines and then into characters using a pixel-by-pixel approach, scanning from the top-left of the image to the bottom-right. Each segmented character image is resized to a 16 × 16 bitmap. A database containing characters of both languages in various fonts is then compared against the resized character image using a pixel-match algorithm, and the matched character is displayed in the notepad. This approach achieves an efficiency of 87.25%.
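The pipeline outlined in the abstract (segment, resize to 16 × 16, pixel-match against a database) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`runs`, `segment`, `resize`, `match_score`, `recognize`), the use of row and column ink runs for segmentation, nearest-neighbour resizing, and the "fraction of equal pixels" score are all assumptions chosen for illustration. The input is assumed to be an already binarized image (1 = ink, 0 = background), and the templates are assumed to be 16 × 16 bitmaps.

```python
from typing import Dict, List, Tuple

Bitmap = List[List[int]]  # assumed convention: 1 = ink, 0 = background


def runs(flags: List[bool]) -> List[Tuple[int, int]]:
    """Return [start, end) index ranges of consecutive True values."""
    out, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(flags)))
    return out


def segment(image: Bitmap) -> List[List[Bitmap]]:
    """Split a binary page into text lines, then each line into glyphs."""
    lines = []
    # Scan rows top to bottom: consecutive rows containing ink form a line.
    for top, bottom in runs([any(row) for row in image]):
        band = image[top:bottom]
        col_has_ink = [any(r[x] for r in band) for x in range(len(band[0]))]
        glyphs = []
        # Scan columns left to right: consecutive inked columns form a glyph.
        for left, right in runs(col_has_ink):
            glyph = [r[left:right] for r in band]
            inked_rows = runs([any(r) for r in glyph])
            # Trim blank rows above and below this particular glyph.
            glyphs.append(glyph[inked_rows[0][0]:inked_rows[-1][1]])
        lines.append(glyphs)
    return lines


def resize(glyph: Bitmap, size: int = 16) -> Bitmap:
    """Nearest-neighbour resize of a glyph to a size x size bitmap."""
    h, w = len(glyph), len(glyph[0])
    return [[glyph[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def match_score(a: Bitmap, b: Bitmap) -> float:
    """Fraction of positions where two equally sized bitmaps agree."""
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / (len(a) * len(a[0]))


def recognize(glyph: Bitmap, templates: Dict[str, Bitmap]) -> str:
    """Return the label of the template that best matches the glyph."""
    probe = resize(glyph)
    return max(templates, key=lambda label: match_score(probe, templates[label]))
```

In use, `segment(page)` would yield one list of glyph bitmaps per text line, and each glyph would be passed to `recognize` together with a dictionary mapping character labels to 16 × 16 templates. A real system for Malayalam and English would additionally need binarization, handling of touching or multi-part characters, and templates for each supported font, none of which this sketch attempts.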
Keywords :
document image processing; optical character recognition; English; Malayalam; bilingual OCR system; bilingual document; character image; documents recognition; image segmentation; pixel by pixel approach; printed documents; Character recognition; Databases; Feature extraction; Image segmentation; Optical character recognition software; Optical imaging; Pixel; Bilingual OCR; Feature Extraction; Handwritten characters; Malayalam; Optical Character Recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electronics Computer Technology (ICECT), 2011 3rd International Conference on
Conference_Location :
Kanyakumari
Print_ISBN :
978-1-4244-8678-6
Electronic_ISBN :
978-1-4244-8679-3
Type :
conf
DOI :
10.1109/ICECTECH.2011.5941797
Filename :
5941797