DocumentCode :
3141284
Title :
Multifont classification using typographical attributes
Author :
Jung, Min-Chul ; Shin, Yong-Chul ; Srihari, Sargur N.
Author_Institution :
Center of Excellence for Document Analysis Recognition, State Univ. of New York, Buffalo, NY, USA
fYear :
1999
fDate :
20-22 Sep 1999
Firstpage :
353
Lastpage :
356
Abstract :
This paper introduces a multifont classification scheme to aid the recognition of multifont and multisize characters. The scheme extracts typographical attributes such as ascenders, descenders, and serifs from a word image and feeds them to a neural network classifier, which produces the font classification. It classifies 7 commonly used fonts at all point sizes from 7 to 18 and handles a wide range of image quality, even with severely touching characters. Font detection can improve character segmentation as well as character recognition, because identifying the font provides information on the structure and typographical design of the characters. The algorithm can therefore help a machine-printed OCR system maintain good recognition rates regardless of font and size. In experiments, font classification accuracy reaches about 95 percent even with severely touching characters. The technique developed for the 7 selected fonts in this paper can be applied to any other fonts.
Keywords :
character sets; document image processing; image classification; image segmentation; neural nets; optical character recognition; ascenders; character recognition; character segmentation; descenders; font detection; font identification; image quality; machine printed OCR system; multifont character recognition; multifont classification; multisize character recognition; neural network classifier; serifs; severely touching characters; typographical attributes; typographical design; word image; Character recognition; Electronic switching systems; Image quality; Image segmentation; Optical character recognition software; Read only memory; Shape; Text analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location :
Bangalore
Print_ISBN :
0-7695-0318-7
Type :
conf
DOI :
10.1109/ICDAR.1999.791797
Filename :
791797