DocumentCode
1583303
Title
A real-world evaluation of a generic document recognition method applied to a military form of the 19th century
Author
Coüasnon, Bertrand ; Pasquer, Laurent
Author_Institution
INSA-Dept. Inf., IRISA, Rennes, France
fYear
2001
fDate
2001
Firstpage
779
Lastpage
783
Abstract
In this paper we present a real-world evaluation of DMOS, a new generic document recognition method. The method uses a new grammatical formalism (EPF) and an associated parser that can introduce context into segmentation. We have implemented DMOS to build an automatic generator of structured document recognition systems. We have already produced three recognition systems by changing only the EPF grammar: one for musical scores, one for mathematical formulae and one for recursive table structures. Here we present a specific light grammar that automatically recognizes quite damaged 19th-century military forms. The quality of these forms is far from perfect: table lines are poorly printed, and the paper is so thin that the reverse side shows through (the forms are two-sided), but the biggest problem comes from small sheets of paper that hide part of the structure. The system was evaluated on 5268 images, and the results show that it did not make any mistakes. Moreover, it recognizes the entire structure in 97.2% of the forms (the other 2.8% are automatically set apart).
Keywords
character recognition; grammars; image segmentation; DMOS; EPF; generic document recognition method; grammatical formalism; light grammar; mathematical formulae; military form; musical scores; parser; real-world evaluation; recursive table structures; segmentation; structured document recognition systems; Machine assisted indexing; Testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on
Conference_Location
Seattle, WA
Print_ISBN
0-7695-1263-1
Type
conf
DOI
10.1109/ICDAR.2001.953894
Filename
953894