Title :
A Malayalam OCR System Using Column-Stochastic Image Matrix Approach
Author :
Philip, Bindu ; Samuel, R. D. Sudhaker
Author_Institution :
Dept. of Electron. & Commun., S.J. Coll. of Eng., Mysore, India
Abstract :
Indian languages, especially South Indian languages, have several distinct characteristics that are exploited in developing a robust optical character recognition (OCR) system. This paper addresses the segmentation of printed Malayalam characters, a fairly complex task, along with their characterization through the non-trivial dominant eigenvalues of column-stochastic image matrices. Rectangular image matrices obtained after digitization, segmentation and normalization are converted to column-stochastic square matrices. The non-trivial dominant eigenvalues of such matrices have proved to be unique and therefore suitable for characterizing printed Malayalam characters. A novel segmentation algorithm is also proposed and tested. The results and analysis presented indicate that the OCR system performs effectively.
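The feature described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the rectangular matrix is made square or how empty columns are handled, so the nearest-neighbour resampling, the 8 x 8 size, the uniform fill for all-zero columns, and the function names `to_column_stochastic` and `nontrivial_dominant_eigenvalue` are all assumptions made for illustration. A column-stochastic matrix always has 1 as its dominant eigenvalue (the trivial one), so the sketch takes the eigenvalue with the next-largest modulus as the non-trivial dominant eigenvalue.

```python
import numpy as np

def to_column_stochastic(img, size=8):
    """Resample a glyph image to size x size and scale each column to sum to 1.

    Assumptions (not from the paper): nearest-neighbour resampling and a
    uniform fill for columns that contain no ink.
    """
    img = np.asarray(img, dtype=float)
    # Nearest-neighbour row/column indices that map the rectangle onto a square.
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    square = img[np.ix_(rows, cols)]  # fancy indexing returns a copy
    # An all-zero column cannot be normalized, so make it uniform instead.
    empty = square.sum(axis=0) == 0
    square[:, empty] = 1.0
    # Broadcasting divides every column j by its own sum.
    return square / square.sum(axis=0)

def nontrivial_dominant_eigenvalue(P):
    """Return the eigenvalue with the second-largest modulus.

    The largest-modulus eigenvalue of a column-stochastic matrix is the
    trivial value 1, so it is skipped.
    """
    vals = np.linalg.eigvals(P)
    order = np.argsort(-np.abs(vals))
    return vals[order[1]]

# Hypothetical 12 x 10 binary glyph standing in for a segmented character.
glyph = np.random.default_rng(0).integers(0, 2, size=(12, 10))
P = to_column_stochastic(glyph)
feature = nontrivial_dominant_eigenvalue(P)
```

The returned value may be complex, since the image matrix is not symmetric; a classifier built on this sketch would need to compare real and imaginary parts, or the modulus, depending on the design.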
Keywords :
eigenvalues and eigenfunctions; image segmentation; matrix algebra; optical character recognition; Malayalam OCR system; South Indian languages; column-stochastic image matrix approach; dominant eigenvalues; printed Malayalam characters segmentation; rectangular image matrices; robust optical character recognition system; Character recognition; Communications technology; Educational institutions; Feature extraction; Image segmentation; Matrix converters; Optical character recognition software; Optical computing; Support vector machine classification; Support vector machines; Column-stochastic image matrix; Hierarchical classifier; Malayalam optical character recognition; Segmentation;
Conference_Title :
Advances in Recent Technologies in Communication and Computing, 2009. ARTCom '09. International Conference on
Conference_Location :
Kottayam, Kerala
Print_ISBN :
978-1-4244-5104-3
Electronic_ISBN :
978-0-7695-3845-7
DOI :
10.1109/ARTCom.2009.146