Title :
A Blind Indic Script Recognizer for Multi-script Documents
Author :
Pati, Peeta Basa ; Ramakrishnan, A.G.
Author_Institution :
Indian Inst. of Sci., Bangalore
Abstract :
We report a hierarchical blind script identifier for 11 different Indian scripts. An initial grouping of the 11 scripts is accomplished at the first level of this hierarchy. At the subsequent level, we recognize the script in each group. The various nodes of this tree use different feature-classifier combinations. For each script, a database of 20,000 words in different font styles and sizes has been collected and used. The effectiveness of Gabor and Discrete Cosine Transform features has been independently evaluated using nearest neighbor, linear discriminant and support vector machine classifiers. The minimum and maximum accuracies obtained using this hierarchical mechanism are 92.2% and 97.6%, respectively.
Keywords :
discrete cosine transforms; document image processing; image classification; natural language processing; support vector machines; Gabor transform; blind Indic script recognizer; discrete cosine transform; feature-classifier combinations; linear discriminant classifiers; multi-script documents; nearest neighbor classifiers; support vector machine classifiers; Biomedical imaging; Discrete cosine transforms; Filter bank; Frequency; Gabor filters; Laboratories; Nearest neighbor searches; Spatial databases; Support vector machine classification; Support vector machines;
Conference_Title :
Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
Conference_Location :
Parana
Print_ISBN :
978-0-7695-2822-9
DOI :
10.1109/ICDAR.2007.4377115