  • DocumentCode
    591952
  • Title
    Annotating handwritten characters with minimal human involvement in a semi-supervised learning strategy
  • Author
    Richarz, J. ; Vajda, Szilard ; Fink, Glenn A.
  • Author_Institution
    Fac. of Comput. Sci. XII, Tech. Univ. Dortmund, Dortmund, Germany
  • fYear
    2012
  • fDate
    18-20 Sept. 2012
  • Firstpage
    23
  • Lastpage
    28
  • Abstract
    One obstacle in the automatic analysis of handwritten documents is the huge amount of labeled data typically needed for classifier training. This is especially true when the document scans are of bad quality and different writers and writing styles have to be covered. Consequently, the considerable human effort required in the process currently prohibits the automatic transcription of large document collections. In this paper, two semi-supervised multiview learning approaches are presented, reducing the manual burden by robustly deriving a large number of labels from relatively few manual annotations. The first is based on cluster-level annotation followed by a majority decision, whereas the second casts the labeling process as a retrieval task and derives labels by voting among ranked lists. Both methods are thoroughly evaluated in a handwritten character recognition scenario using realistic document data. It is demonstrated that competitive recognition performance can be maintained by labeling only a fraction of the data.
  • Keywords
    document image processing; handwritten character recognition; image classification; image retrieval; learning (artificial intelligence); pattern clustering; classifier training; cluster-level annotation; competitive recognition performance; document collection; handwritten character annotation; handwritten character recognition; handwritten document analysis; human involvement; retrieval task; semisupervised multiview learning strategy; Character recognition; Handwriting recognition; Humans; Labeling; Manuals; Reliability; Training; document analysis; handwritten character recognition; multiview learning; semi-supervised annotation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
  • Conference_Location
    Bari
  • Print_ISBN
    978-1-4673-2262-1
  • Type
    conf
  • DOI
    10.1109/ICFHR.2012.181
  • Filename
    6424365