DocumentCode
2038751
Title
Bilingual OCR system for printed documents in Malayalam and English
Author
Rahiman, M.A. ; Adheena, C.V. ; Anitha, R. ; Deepa, N. ; Kumar, G. Manoj ; Rajasree, M.S.
Author_Institution
Karpagam Univ., Coimbatore, India
Volume
3
fYear
2011
fDate
8-10 April 2011
Firstpage
40
Lastpage
45
Abstract
India is a multilingual, multi-script country, and a single line of a bilingual document page may contain words in both a regional language and English. Recognizing multi-script documents is a challenging task that demands additional effort from OCR designers to improve the accuracy rate. This paper presents a bilingual OCR system for printed Malayalam and English text. The proposed algorithm accepts a scanned image of printed characters as input and produces editable Malayalam and English characters in a predefined format as output. The acquired image is segmented into lines and then into characters using a pixel-by-pixel approach, scanning from the top-left of the image to the bottom-right. Each character image obtained from segmentation is resized to a 16 × 16 bitmap. A database containing characters of both languages in various fonts is compared against the resized character image using a pixel-match algorithm, and the matched character is displayed in the notepad. An efficiency of 87.25% is obtained using this approach.
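The pipeline summarized in the abstract (line and character segmentation, resizing to 16 × 16, pixel matching against stored templates) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a binarized page given as nested lists with 1 for ink, splits on blank rows and columns, resizes by nearest-neighbour sampling, and scores a match as the fraction of equal pixels. The names `runs`, `segment`, `resize`, `match_score`, `recognize` and the two-entry `templates` dictionary are hypothetical.

```python
SIZE = 16  # side of the comparison bitmap named in the abstract


def runs(flags):
    """Return (start, end) index pairs for each run of consecutive True values."""
    spans, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(flags)))
    return spans


def segment(img):
    """Split a binary image (1 = ink) into character bitmaps, top to bottom, left to right."""
    chars = []
    for top, bottom in runs([any(row) for row in img]):  # text lines
        line = img[top:bottom]
        cols = [any(row[x] for row in line) for x in range(len(line[0]))]
        for left, right in runs(cols):  # characters within the line
            glyph = [row[left:right] for row in line]
            inked = [y for y, row in enumerate(glyph) if any(row)]
            chars.append(glyph[inked[0]:inked[-1] + 1])  # trim blank rows
    return chars


def resize(glyph, size=SIZE):
    """Nearest-neighbour resize of a binary glyph to size x size (an assumed method)."""
    h, w = len(glyph), len(glyph[0])
    return [[glyph[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def match_score(a, b):
    """Fraction of pixel positions where two equally sized bitmaps agree."""
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / (len(a) * len(a[0]))


def recognize(glyph, templates):
    """Return the label of the template that best matches the resized glyph."""
    bitmap = resize(glyph)
    return max(templates, key=lambda label: match_score(bitmap, templates[label]))


# Toy page: a vertical bar followed by a hollow box on one text line.
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
# Hypothetical two-entry template database; a real one would hold many fonts.
templates = {
    "I": resize([[1], [1], [1]]),
    "O": resize([[1, 1, 1], [1, 0, 1], [1, 1, 1]]),
}
text = "".join(recognize(glyph, templates) for glyph in segment(page))
print(text)
```

In this sketch the template database is just a dictionary of label to 16 × 16 bitmap; storing several bitmaps per character (one per font) and taking the best score over all of them would mirror the multi-font database mentioned in the abstract.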
Keywords
document image processing; optical character recognition; English; Malayalam; bilingual OCR system; bilingual document; character image; documents recognition; image segmentation; pixel by pixel approach; printed documents; Character recognition; Databases; Feature extraction; Image segmentation; Optical character recognition software; Optical imaging; Pixel; Bilingual OCR; Feature Extraction; Handwritten characters; Malayalam; Optical Character Recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Electronics Computer Technology (ICECT), 2011 3rd International Conference on
Conference_Location
Kanyakumari
Print_ISBN
978-1-4244-8678-6
Electronic_ISBN
978-1-4244-8679-3
Type
conf
DOI
10.1109/ICECTECH.2011.5941797
Filename
5941797