• DocumentCode
    183350
  • Title

    Pixel Level Handwritten and Printed Content Discrimination in Scanned Documents

  • Author

    Seuret, Mathias ; Liwicki, Marcus ; Ingold, Rolf

  • Author_Institution
    Dept. of Inf., Univ. of Fribourg, Fribourg, Switzerland
  • fYear
    2014
  • fDate
    1-4 Sept. 2014
  • Firstpage
    423
  • Lastpage
    428
  • Abstract
    Classification of the content of a scanned document as either printed or handwritten is typically tackled as a segmentation problem of pages into text lines or words. However, these methods are not applicable to documents where handwritten annotations overlay printed text. In this paper we propose to treat the task as a pixel classification task, i.e., to classify individual foreground pixels as either printed or handwritten. Our method uses various features of diverse nature, taking the surrounding window into account. The influence of the features and their parameters is investigated and optimized on a validation set. Each foreground pixel is then classified by a multilayer perceptron using feature vectors based on a pixel neighborhood. Finally, a post-processing step corrects typical misclassifications, i.e., it removes outliers based on several heuristics. We evaluated our method on printed documents with real handwritten annotations and reached an accuracy of 96.10% on the test set. This is significantly higher than previously published methods based on local features.
  • Keywords
    document image processing; feature extraction; handwritten character recognition; image classification; image segmentation; multilayer perceptrons; text analysis; vectors; content classification; feature vectors; handwritten annotations; multilayer perceptron; page segmentation; pixel classification task; scanned documents; Accuracy; Feature extraction; Histograms; Image color analysis; Standards; Training; Vectors; classification; handwritings; handwritten; printed; recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
  • Conference_Location
    Heraklion
  • ISSN
    2167-6445
  • Print_ISBN
    978-1-4799-4335-7
  • Type

    conf

  • DOI
    10.1109/ICFHR.2014.77
  • Filename
    6981056