Regional Information Center for Science and Technology - An OCR system for printed Kannada using k-means clustering

DocumentCode :

2472098

Title :

An OCR system for printed Kannada using k-means clustering

Author :

Sheshadri, Karthik ; Ambekar, Pavan Kumar T ; Prasad, Deeksha Padma ; Kumar, Ramakanth P.

Author_Institution :

Dept. of Comput. Sci., Rashtreeya Vidyalaya Coll. of Eng., Bangalore, India

fYear :

2010

fDate :

14-17 March 2010

Firstpage :

183

Lastpage :

187

Abstract :

We address the problem of Kannada character recognition, and propose a recognition mechanism based on k-means clustering. The large dataset of Kannada characters and their similarity makes the problem one order of magnitude more difficult than for a standard language like English. We propose a segmentation technique to decompose each character into components from 3 base classes, thus reducing the magnitude of the problem. k-means provides a natural degree of font independence and this is used to reduce the size of the training database to about a tenth of those used in related work. Consequently, recognition proceeds an order of magnitude faster. We present accuracy comparisons with related work, showing the proposed method to yield a better peak accuracy. We also discuss the relative merits of probabilistic and geometric seeding in k-means.

Keywords :

geometry; image segmentation; optical character recognition; pattern clustering; probability; OCR system; geometric seeding; k-means clustering; printed Kannada character recognition; probabilistic seeding; segmentation technique; Brightness; Character recognition; Computer science; Gray-scale; Image converters; Image databases; Image segmentation; Optical character recognition software; Pixel; Spatial databases;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Industrial Technology (ICIT), 2010 IEEE International Conference on

Conference_Location :

Viña del Mar

Print_ISBN :

978-1-4244-5695-6

Electronic_ISBN :

978-1-4244-5696-3

Type :

conf

DOI :

10.1109/ICIT.2010.5472676

Filename :

5472676

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2472098