• DocumentCode
    2205575
  • Title
    Automatic Transcription of Handwritten Medieval Documents
  • Author
    Fischer, Andreas ; Wüthrich, Markus ; Liwicki, Marcus ; Frinken, Volkmar ; Bunke, Horst ; Viehhauser, Gabriel ; Stolz, Michael
  • Author_Institution
    Inst. of Comput. Sci. & Appl. Math., Univ. of Bern, Bern, Switzerland
  • fYear
    2009
  • fDate
    9-12 Sept. 2009
  • Firstpage
    137
  • Lastpage
    142
  • Abstract
    The automatic transcription of historical documents is vital for the creation of digital libraries. In order to make images of valuable old documents amenable to browsing, a transcription of high accuracy is needed. In this paper, two state-of-the-art recognizers originally developed for modern scripts are applied to medieval documents. The first is based on Hidden Markov Models and the second uses a Neural Network with a bidirectional Long Short-Term Memory. On a dataset of word images extracted from a medieval manuscript of the 13th century, written in Middle High German by several writers, it is demonstrated that a word accuracy of 93.32% is achievable. This is far above the word accuracy of 77.12% achieved with the same recognizers for unconstrained modern scripts written in English. These results encourage the development of real-world systems for automatic transcription of historical documents with a view to image and text browsing in digital libraries.
  • Keywords
    digital libraries; document image processing; feature extraction; handwriting recognition; hidden Markov models; history; neural nets; English; automatic transcription; bidirectional long short-term memory; handwritten medieval documents; historical documents; image browsing; middle high German; neural network; real world systems; text browsing; unconstrained modern scripts; word accuracy; word image extraction; Artificial intelligence; Computer science; Handwriting recognition; Hidden Markov models; Mathematics; Multimedia systems; Neural networks; Software libraries; Vocabulary; Writing; Computer Vision for Cultural Heritage
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Virtual Systems and Multimedia, 2009. VSMM '09. 15th International Conference on
  • Conference_Location
    Vienna
  • Print_ISBN
    978-0-7695-3790-0
  • Type
    conf
  • DOI
    10.1109/VSMM.2009.26
  • Filename
    5306020