Title :
Filtering in Chinese document images based on templates and confidence measure
Author :
Jiewei, Chen ; Weiran, Xu ; Jun, Guo
Author_Institution :
Sch. of Inf. Eng., Beijing Univ. of Posts & Telecommun., China
fDate :
31 Aug.-4 Sept. 2004
Abstract :
A fast approach to keyword spotting in Chinese document images, based on multiple-template matching and a confidence measure, is presented. Before keyword searching, the system generates a keyword lexicon in diverse fonts together with two-stage feature vectors. A two-stage retrieval scheme and the Boyer-Moore algorithm are proposed to accelerate the retrieval process. A distance measure between the candidate character and the templates is used to identify and rank similar templates. The new system performs significantly better than traditional OCR and image-based approaches. Experimental results confirm the robustness of the proposed approach over a wide range of degradations.
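The abstract does not give the feature vectors, the distance measure, or the way Boyer-Moore is applied, so the following is only a minimal sketch under stated assumptions. It assumes Euclidean distance between hypothetical feature vectors for ranking templates, and it uses the simplified Boyer-Moore-Horspool variant (bad-character shifts only) over a sequence of already-labelled symbols rather than the full Boyer-Moore algorithm. The names `rank_templates` and `horspool_find` are illustrative and do not come from the paper.

```python
import math


def rank_templates(candidate, templates, top_k=3):
    """Rank templates by distance to a candidate character's feature vector.

    Assumption: Euclidean distance; the paper's actual measure is not stated.
    `templates` maps a hypothetical template name to its feature vector.
    """
    scored = [(math.dist(candidate, vec), name) for name, vec in templates.items()]
    scored.sort()  # smallest distance first
    return [(name, dist) for dist, name in scored[:top_k]]


def horspool_find(text, pattern):
    """Return all start positions of `pattern` in `text`.

    Boyer-Moore-Horspool: after each alignment, skip ahead according to the
    symbol under the last pattern position (bad-character rule only).
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Shift for each symbol = distance from its last occurrence
    # (excluding the final pattern position) to the pattern's end.
    shift = {sym: m - 1 - i for i, sym in enumerate(pattern[:-1])}
    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Symbols absent from the pattern allow a full-length skip.
        pos += shift.get(text[pos + m - 1], m)
    return hits
```

In this sketch, ranking would pick out the most similar templates for each candidate character, and the skip table lets the keyword search pass over most positions without comparing them, which is the kind of speed-up the abstract attributes to Boyer-Moore.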
Keywords :
character recognition; document image processing; feature extraction; image matching; image retrieval; information filtering; natural languages; Boyer-Moore algorithm; Chinese document image filter; candidate character; confidence measure; keyword lexicon; multiple template matching; two-stage feature vector; two-stage retrieval scheme; Acceleration; Character recognition; Degradation; Image recognition; Image retrieval; Image segmentation; Information filtering; Information retrieval; Optical character recognition software; Testing
Conference_Title :
Signal Processing, 2004. Proceedings. ICSP '04. 2004 7th International Conference on
Print_ISBN :
0-7803-8406-7
DOI :
10.1109/ICOSP.2004.1441582