DocumentCode :
2142391
Title :
Keyword Spotting in Offline Chinese Handwritten Documents Using a Statistical Model
Author :
Huang, Liang ; Yin, Fei ; Chen, Qing-Hu ; Liu, Cheng-Lin
Author_Institution :
Sch. of Electron. Inf., Wuhan Univ., Wuhan, China
fYear :
2011
fDate :
18-21 Sept. 2011
Firstpage :
78
Lastpage :
82
Abstract :
This paper proposes a method for keyword spotting in offline Chinese handwritten documents using a statistical model. Given a text query word, the method measures the similarity between the query word and every candidate word in the document by combining a character classifier and four classifiers characterizing the geometric contexts. Text lines are over-segmented into primitive segments, candidate characters and words are generated by concatenating consecutive segments, and a beam search strategy is used to search all the candidate words. The character classifier and the model combining weights are trained by optimizing a one-vs-all discrimination objective so as to maximize the similarity of true words and minimize the similarity of imposters. In experiments on a test dataset containing 1,015 pages by 180 writers, the proposed method yields promising performance. For retrieving four-character words, the recall, precision and F-measure are 92.47%, 83.76% and 87.90%, respectively.
Keywords :
document handling; pattern classification; query processing; beam search strategy; character classifier; consecutive segment concatenation; geometric context characterization; keyword spotting method; offline Chinese handwritten documents; one-vs-all discrimination objective; statistical model; text query word; Context; Context modeling; Feature extraction; Image segmentation; Prototypes; Support vector machines; Training; Chinese handwritten documents; Keyword spotting; statistical model;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
ISSN :
1520-5363
Print_ISBN :
978-1-4577-1350-7
Electronic_ISBN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2011.25
Filename :
6065280