DocumentCode :
591956
Title :
Handwritten and Machine Printed Text Separation in Document Images Using the Bag of Visual Words Paradigm
Author :
Zagoris, Konstantinos ; Pratikakis, Ioannis ; Antonacopoulos, A. ; Gatos, Basilis ; Papamarkos, Nikolaos
Author_Institution :
Sch. of Comput., Univ. of Salford, Salford, UK
fYear :
2012
fDate :
18-20 Sept. 2012
Firstpage :
103
Lastpage :
108
Abstract :
In a number of types of documents, ranging from forms to archive documents and books with annotations, machine printed and handwritten text may be present in the same document image, giving rise to significant issues within a digitisation and recognition pipeline. It is therefore necessary to separate the two types of text before applying different recognition methodologies to each. In this paper, a new approach is proposed which strives towards identifying and separating handwritten from machine printed text using the Bag of Visual Words paradigm (BoVW). Initially, blocks of interest are detected in the document image. For each block, a descriptor is calculated based on the BoVW. The final characterization of the blocks as Handwritten, Machine Printed or Noise is made by a Support Vector Machine classifier. The promising performance of the proposed approach is shown by using a consistent evaluation methodology which couples meaningful measures along with a new dataset.
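The descriptor step mentioned in the abstract can be illustrated with a minimal sketch, assuming that each block yields a set of local feature vectors (the keywords mention SIFT) and that a codebook of visual words is already available, for example from clustering. The function names, the two-dimensional toy vectors and the L1 normalisation are illustrative assumptions, not details taken from the paper. The Support Vector Machine classification step is not shown.

```python
import math


def nearest_codeword(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor (Euclidean distance)."""
    best_index, best_dist = 0, math.inf
    for index, word in enumerate(codebook):
        dist = math.dist(descriptor, word)
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bovw_histogram(local_descriptors, codebook):
    """Build an L1-normalised histogram of codeword occurrences for one block."""
    counts = [0] * len(codebook)
    for descriptor in local_descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        # A block without local features gets an all-zero descriptor.
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Hypothetical toy data: a 3-word codebook and 4 local descriptors from one block.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
block_descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
histogram = bovw_histogram(block_descriptors, codebook)
print(histogram)
```

In a pipeline of the kind the abstract describes, one such histogram per block would be the input to the classifier that labels the block as Handwritten, Machine Printed or Noise.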
Keywords :
document image processing; handwritten character recognition; image classification; support vector machines; text analysis; BoVW; annotation; archive document; bag of visual words paradigm; book; descriptor; digitisation pipeline; document image; document type; handwritten text separation; machine printed text separation; recognition methodology; recognition pipeline; support vector machine classifier; Gabor filters; Handwriting recognition; Hidden Markov models; Image segmentation; Noise; Text recognition; Visualization; SIFT; bag of visual words; support vector machines;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
Conference_Location :
Bari
Print_ISBN :
978-1-4673-2262-1
Type :
conf
DOI :
10.1109/ICFHR.2012.207
Filename :
6424377