DocumentCode :
595035
Title :
Font identification — In context of an Indic script
Author :
Chanda, Sukalpa ; Pal, Umapada ; Franke, Katrin
Author_Institution :
Dept. of Comput. Sci. & Media Technol., Gjovik Univ. Coll., Gjovik, Norway
fYear :
2012
fDate :
11-15 Nov. 2012
Firstpage :
1655
Lastpage :
1658
Abstract :
Font can serve as a measure of similarity among documents written in the same script, making it possible to automatically retrieve document images in a specific font from a large digital document repository. Optical Font Recognition could therefore be a useful pre-processing step in an automated questioned document analysis system, for sorting documents with similar fonts. We propose a scheme to identify 10 different fonts for an Indic script (Bangla). Curvature-based features are extracted from segmented characters and fed to a Support Vector Machine (SVM) classifier, which determines the font type of each segmented character obtained from a document. The font of the document is then identified by majority voting over the character-level decisions among the 10 fonts. Using a Multiple Kernel SVM classifier, we obtained 98.5% accuracy on 400 test documents (40 documents for each font type).
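The document-level decision described in the abstract can be illustrated with a minimal sketch. It assumes the character-level classifier has already produced one font label per segmented character; the function name `identify_document_font`, the labels such as `font_3`, and the tie-breaking rule are hypothetical and are not taken from the paper.

```python
from collections import Counter


def identify_document_font(char_predictions):
    """Return the font label predicted most often across a document's characters."""
    if not char_predictions:
        raise ValueError("need at least one character-level prediction")
    counts = Counter(char_predictions)
    # most_common(1) yields the label with the highest vote count. Ties go to
    # the label seen first, which is an assumption of this sketch only.
    label, _ = counts.most_common(1)[0]
    return label


# Hypothetical per-character classifier outputs for one document.
predictions = ["font_3", "font_3", "font_7", "font_3", "font_1"]
print(identify_document_font(predictions))
```

Voting over many characters means that a few misclassified characters do not change the document-level result, as long as the correct font still receives the most votes.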
Keywords :
character sets; document image processing; feature extraction; image classification; image retrieval; image segmentation; information retrieval; optical character recognition; support vector machines; Bangla; Indic script; automated questioned document analysis; character segmentation; curvature-based feature extraction; digital document repository; document image retrieval; font identification; multikernel SVM classifier; optical font recognition; sorting document; support vector machine; Accuracy; Feature extraction; Image segmentation; Kernel; Pattern recognition; Support vector machines; Training; Bangla Script; Computational Forensics; Document Analysis; Font Identification; MKL SVM; SVM;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition (ICPR), 2012 21st International Conference on
Conference_Location :
Tsukuba
ISSN :
1051-4651
Print_ISBN :
978-1-4673-2216-4
Type :
conf
Filename :
6460465