Title :
An extension matrix approach to Chinese character recognition
Author :
Shu, Wenhao ; Shi, Daming ; Qian, Guolian ; Wang, Fusi
Author_Institution :
Hong Kong Polytech. Univ., Hung Hom, China
Abstract :
Optical character recognition (OCR) provides a way to acquire, archive and retrieve the large amount of paper-based information that is still in common use in daily life. A classical OCR system consists of a series of stages, such as format analysis, text segmentation, feature extraction and classification. This paper focuses on the last two stages and claims two contributions. First, rapid transformed stroke density features (SDFs) are used for preliminary classification, and outline primitive structural features are used for final classification. Second, the original extension matrix algorithm is improved by heuristic path searching based on information entropy and a Laplace error rate evaluation function. Our experimental results show that the rapid transformed SDFs are insensitive to image translation or rotation, and that the improved extension matrix algorithm outperforms other inductive approaches based on AE1 and AQ15. The excellent performance on a large data set also indicates that our proposed approach is effective and efficient.
Keywords :
document image processing; feature extraction; image classification; learning by example; optical character recognition; search problems; AE1; AQ15; Chinese character recognition; Laplace error rate evaluation function; OCR; classification; experimental results; extension matrix approach; format analysis; heuristic path searching; image rotation; image translation; inductive learning; information entropy; large data set; primitive structural features; stroke density features; text segmentation; Character recognition; Error analysis; Heuristic algorithms; Learning systems; Optical character recognition software; Optical distortion; Paper technology; Pixel
Conference_Title :
Systems, Man, and Cybernetics, 2000 IEEE International Conference on
Conference_Location :
Nashville, TN
Print_ISBN :
0-7803-6583-6
DOI :
10.1109/ICSMC.2000.884415