DocumentCode :
2142492
Title :
A Handwritten Character Extraction Algorithm for Multi-language Document Image
Author :
Song, Yonghong ; Xiao, Guilin ; Zhang, Yuanlin ; Yang, Lei ; Zhao, Liuliu
Author_Institution :
Inst. of Artificial Intell. & Robot., Xi'an Jiaotong Univ., Xi'an, China
fYear :
2011
fDate :
18-21 Sept. 2011
Firstpage :
93
Lastpage :
98
Abstract :
In this paper, we propose a novel method for extracting handwritten characters from multi-language document images, which may contain various types of characters, e.g. Chinese, English, Japanese, or a mixture of them. First, text patches in the document image are segmented using connected component analysis; the rules for merging connected components are chosen according to the results of language identification. Then, features are extracted for each basic analysis unit, the text patch. A genetic algorithm is applied for feature fusion and patch type classification. Finally, a Markov random field model is used as a post-processing step to correct misclassified text patch types by considering the document context. Experimental results show that the proposed algorithm clearly improves the performance of handwritten character extraction.
Keywords :
Markov processes; document image processing; genetic algorithms; handwritten character recognition; image classification; Markov random field model; connected component analysis; feature fusion; genetic algorithm; handwritten character extraction; language identification; multilanguage document image; patch type classification; unit-text patch; Feature extraction; Genetic algorithms; Image segmentation; Markov random fields; Merging; Text analysis; Vectors; Markov random field; document segmentation; feature fusion; handwritten character extraction; multi-language;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
ISSN :
1520-5363
Print_ISBN :
978-1-4577-1350-7
Electronic_ISBN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2011.28
Filename :
6065283