DocumentCode :
1581750
Title :
Separating handwritten material from machine printed text using hidden Markov models
Author :
Guo, Jinhong K. ; Ma, Matthew Y.
Author_Institution :
Panasonic Inf. & Networking Technols. Lab., Princeton, NJ, USA
fYear :
2001
fDate :
2001
Firstpage :
439
Lastpage :
443
Abstract :
In this paper, we address the problem of separating handwritten annotations from machine-printed text within a document. We present an algorithm based on the theory of hidden Markov models (HMMs) to distinguish between machine-printed and handwritten materials. No OCR results are required before or during the process, and classification is performed at the word level. Handwritten annotations are not limited to marginal areas: the approach can handle document images in which handwritten annotations are overlaid on machine-printed text, and it has shown promise in our experiments. Experimental results show that the proposed method achieves 72.19% recall for fully extracted handwritten words and 90.37% for partially extracted words. The precision of handwritten word extraction reaches 92.86%.
Keywords :
document image processing; handwriting recognition; hidden Markov models; document images; document text separation; handwritten annotations; handwritten words extraction; hidden Markov models; machine-printed text; precision; recall; word-level classification; Data mining; Engines; Handwriting recognition; Hidden Markov models; Image coding; Instruments; Laboratories; Neural networks; Optical character recognition software; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7695-1263-1
Type :
conf
DOI :
10.1109/ICDAR.2001.953828
Filename :
953828