DocumentCode
3142953
Title
A full English sentence database for off-line handwriting recognition
Author
Marti, U.-V. ; Bunke, H.
Author_Institution
Inst. für Inf., Bern Univ., Switzerland
fYear
1999
fDate
20-22 Sep 1999
Firstpage
705
Lastpage
708
Abstract
We present a new database for off-line handwriting recognition, together with a few preprocessing and text segmentation procedures. The database is based on the Lancaster-Oslo/Bergen (LOB) corpus. This corpus is a collection of texts that were used to generate forms, which were subsequently filled out by persons in their own handwriting. As of December 1998, the database includes 556 forms produced by approximately 250 different writers. The database consists of full English sentences. It could serve as a basis for a variety of handwriting recognition tasks. The main focus, however, is on recognition techniques that use linguistic knowledge beyond the lexicon level. This knowledge can be automatically derived from the corpus or it can be supplied from external sources.
Keywords
computational linguistics; handwriting recognition; English sentence database; Lancaster-Oslo/Bergen corpus; linguistic knowledge; off-line handwriting recognition; preprocessing; text segmentation; Character recognition; Databases; Handwriting recognition; Informatics; Mathematics; NIST; Optical character recognition software; Read only memory; Speech recognition; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location
Bangalore
Print_ISBN
0-7695-0318-7
Type
conf
DOI
10.1109/ICDAR.1999.791885
Filename
791885