DocumentCode
2447803
Title
Automatic segmentation of the IAM off-line database for handwritten English text
Author
Zimmermann, Matthias ; Bunke, Horst
Author_Institution
Inst. of Informatics & Appl. Math., Bern Univ., Switzerland
Volume
4
fYear
2002
fDate
2002
Firstpage
35
Abstract
Presents an automatic segmentation scheme for cursive handwritten text lines that uses the transcriptions of the text lines and a hidden Markov model (HMM) based recognition system. The segmentation scheme has been developed and tested on the IAM database, which contains offline images of cursively handwritten English text. The original version of this database contains ground truth for complete lines of text only, not for individual words. With the method described in the paper, the usability of the database is greatly improved because accurate bounding box information and ground truth for individual words (including punctuation characters) are now available as well. Applying the segmentation scheme to 417 pages of handwritten text yields a correct word segmentation rate of 98%, producing correct bounding boxes for over 25,000 handwritten words.
Keywords
handwritten character recognition; hidden Markov models; image segmentation; IAM off-line database; accurate bounding box information; automatic segmentation; cursive handwritten text lines; ground truth; handwritten English text; hidden Markov model based recognition system; individual words; punctuation characters; transcriptions; Character recognition; Handwriting recognition; Hidden Markov models; Image databases; Image segmentation; Informatics; Mathematics; System testing; Text recognition; Usability
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition, 2002. Proceedings. 16th International Conference on
ISSN
1051-4651
Print_ISBN
0-7695-1695-X
Type
conf
DOI
10.1109/ICPR.2002.1047394
Filename
1047394