• DocumentCode
    183426
  • Title
    Automatic Line Segmentation and Ground-Truth Alignment of Handwritten Documents
  • Author
    Bluche, Theodore ; Moysset, Bastien ; Kermorvant, Christopher
  • Author_Institution
    A2iA SA, Paris, France
  • fYear
    2014
  • fDate
    1-4 Sept. 2014
  • Firstpage
    667
  • Lastpage
    672
  • Abstract
    In this paper, we present a method for the automatic segmentation and transcript alignment of documents for which the transcript is available only at the document level. We consider several line segmentation hypotheses, and recognition hypotheses for each segmented line. The recognition is highly constrained by the document transcript. We formalize the problem in a weighted finite-state transducer framework. We evaluate how the constraints help achieve a reasonable result. In particular, we assess the performance of the system in terms of both segmentation quality and transcript mapping. The main contribution of this paper is that we jointly find the best segmentation and transcript mapping that align the image with the whole ground-truth text. The evaluation is carried out on fully annotated public databases. Furthermore, we used this system to retrieve training material for the Maurdor evaluation, where the data was annotated only at the paragraph level. With the automatically segmented and annotated lines, we record a relative improvement in Word Error Rate of 35.6%.
  • Keywords
    document image processing; finite state machines; handwriting recognition; handwritten character recognition; image segmentation; Maurdor evaluation; automatic line segmentation; ground-truth alignment; handwritten document; segmentation quality; transcript alignment; transcript mapping; weighted finite-state transducer framework; word error rate; Databases; Feature extraction; Hidden Markov models; Image segmentation; Lattices; Optical character recognition software; Training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
  • Conference_Location
    Heraklion
  • ISSN
    2167-6445
  • Print_ISBN
    978-1-4799-4335-7
  • Type
    conf
  • DOI
    10.1109/ICFHR.2014.117
  • Filename
    6981096