Title :
Font identification — In context of an Indic script
Author :
Chanda, Sukalpa ; Pal, Umapada ; Franke, Katrin
Author_Institution :
Dept. of Comput. Sci. & Media Technol., Gjovik Univ. Coll., Gjovik, Norway
Abstract :
Font can serve as a measure of similarity among multiple documents written in the same script. For example, document images set in a specific font could be retrieved automatically from a large digital document repository. Optical Font Recognition could therefore be a useful pre-processing step in an automated questioned document analysis system, where it would sort documents with similar fonts. We propose a scheme to identify 10 different fonts for an Indic script (Bangla). Curvature-based features are extracted from segmented characters and fed to a Support Vector Machine (SVM) classifier, which determines the font type of each segmented character obtained from a document. The font of the document is then identified by majority voting, over the 10 fonts, across all of its characters. Using a Multiple Kernel SVM classifier, we obtained 98.5% accuracy on 400 test documents (40 documents for each font type).
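The document-level voting step described in the abstract can be illustrated with a minimal sketch. It assumes the per-character font labels have already been produced by some classifier; the function name identify_document_font, the label strings, and the tie-breaking rule are hypothetical and are not taken from the paper.

```python
from collections import Counter


def identify_document_font(char_predictions):
    """Return the font label predicted most often across a document's characters.

    char_predictions: one predicted font label per segmented character
    (assumed to come from a per-character classifier such as an SVM).
    """
    if not char_predictions:
        raise ValueError("document has no segmented characters")
    votes = Counter(char_predictions)
    # most_common(1) yields [(label, count)]; on a tie, the label that
    # appeared first in the input wins (an assumption of this sketch,
    # the abstract does not say how ties are resolved).
    label, _count = votes.most_common(1)[0]
    return label


# Hypothetical per-character outputs for a single document.
preds = ["font_3", "font_3", "font_7", "font_3", "font_1"]
print(identify_document_font(preds))
```

Voting over all characters lets a document be labelled correctly even when some individual characters are misclassified, which is presumably why the decision is made per document and not per character.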
Keywords :
character sets; document image processing; feature extraction; image classification; image retrieval; image segmentation; information retrieval; optical character recognition; support vector machines; Bangla; Indic script; automated questioned document analysis; character segmentation; curvature-based feature extraction; digital document repository; document image retrieval; font identification; multikernel SVM classifier; optical font recognition; sorting document; support vector machine; Accuracy; Feature extraction; Image segmentation; Kernel; Pattern recognition; Support vector machines; Training; Bangla Script; Computational Forensics; Document Analysis; Font Identification; MKL SVM; SVM;
Conference_Title :
Pattern Recognition (ICPR), 2012 21st International Conference on
Conference_Location :
Tsukuba
Print_ISBN :
978-1-4673-2216-4