Title :
A Japanese OCR post-processing approach based on dictionary matching
Author :
Chu-Yu Guo ; Yuan-Yan Tang ; Chang-Song Liu ; Jia Duan
Author_Institution :
Dept. of Comput. & Inf. Sci., Univ. of Macau, Macau, China
Abstract :
This paper describes a dictionary-based post-processing approach for Japanese character recognition. Analysis of experimental OCR data shows that some segmentation and recognition results do not conform to lexical rules, because the recognizer generates characters from shape alone. When the glyphs of the characters to be recognized resemble those of other characters, OCR errors occur easily. To correct these errors, we propose a method based on Limited Length Segmentation Matching and a Bayesian statistical classifier. This method corrects most of the recognition errors caused by similar glyphs. Experimental results show that it is an effective way to improve the recognition rate for Japanese characters.
Keywords :
Bayes methods; data analysis; dictionaries; image classification; image matching; image segmentation; natural language processing; optical character recognition; statistical analysis; Bayesian statistical classifier; Japanese OCR postprocessing approach; Japanese optical character recognition; dictionary matching; experimental data analysis; limited length segmentation matching; Abstracts; Bayes methods; Image segmentation; Optical character recognition software; Bayesian Theory; Dictionary Matching; Japanese Character; Limited Length Segmentation; OCR;
Conference_Title :
Wavelet Analysis and Pattern Recognition (ICWAPR), 2013 International Conference on
Conference_Location :
Tianjin
Print_ISBN :
978-1-4799-0415-0
DOI :
10.1109/ICWAPR.2013.6599286