• DocumentCode
    2447803
  • Title

    Automatic segmentation of the IAM off-line database for handwritten English text

  • Author

    Zimmermann, Matthias ; Bunke, Horst

  • Author_Institution
    Inst. of Informatics & Appl. Math., Bern Univ., Switzerland
  • Volume
    4
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    35
  • Abstract
    Presents an automatic segmentation scheme for cursive handwritten text lines that uses the transcriptions of the text lines and a hidden Markov model (HMM) based recognition system. The segmentation scheme has been developed and tested on the IAM database, which contains offline images of cursively handwritten English text. The original version of this database contains ground truth for complete lines of text only, not for individual words. With the method described in the paper, the usability of the database is greatly improved because accurate bounding box information and ground truth for individual words (including punctuation characters) are now available as well. When the segmentation scheme was applied to 417 pages of handwritten text, a correct word segmentation rate of 98% was achieved, producing correct bounding boxes for over 25,000 handwritten words.
  • Keywords
    handwritten character recognition; hidden Markov models; image segmentation; IAM off-line database; accurate bounding box information; automatic segmentation; cursive handwritten text lines; ground truth; handwritten English text; hidden Markov model based recognition system; individual words; punctuation characters; transcriptions; Character recognition; Handwriting recognition; Hidden Markov models; Image databases; Image segmentation; Informatics; Mathematics; System testing; Text recognition; Usability;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 2002. Proceedings. 16th International Conference on
  • ISSN
    1051-4651
  • Print_ISBN
    0-7695-1695-X
  • Type

    conf

  • DOI
    10.1109/ICPR.2002.1047394
  • Filename
    1047394