DocumentCode :
2447803
Title :
Automatic segmentation of the IAM off-line database for handwritten English text
Author :
Zimmermann, Matthias ; Bunke, Horst
Author_Institution :
Inst. of Informatics & Appl. Math., Bern Univ., Switzerland
Volume :
4
fYear :
2002
fDate :
2002
Firstpage :
35
Abstract :
Presents an automatic segmentation scheme for cursive handwritten text lines that uses the transcriptions of the text lines and a hidden Markov model (HMM) based recognition system. The segmentation scheme has been developed and tested on the IAM database, which contains offline images of cursively handwritten English text. The original version of this database contains ground truth for complete lines of text only, not for individual words. With the method described in the paper, the usability of the database is greatly improved because accurate bounding box information and ground truth for individual words (including punctuation characters) are now available as well. Applying the segmentation scheme to 417 pages of handwritten text achieved a correct word segmentation rate of 98%, producing correct bounding boxes for over 25,000 handwritten words.
Keywords :
handwritten character recognition; hidden Markov models; image segmentation; IAM off-line database; accurate bounding box information; automatic segmentation; cursive handwritten text lines; ground truth; handwritten English text; hidden Markov model based recognition system; individual words; punctuation characters; transcriptions; Character recognition; Handwriting recognition; Hidden Markov models; Image databases; Image segmentation; Informatics; Mathematics; System testing; Text recognition; Usability
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition, 2002. Proceedings. 16th International Conference on
ISSN :
1051-4651
Print_ISBN :
0-7695-1695-X
Type :
conf
DOI :
10.1109/ICPR.2002.1047394
Filename :
1047394