DocumentCode
2763528
Title
Segmentation of low-quality typewritten digits
Author
Rodriguez, C. ; Muguerza, J. ; Navarro, M. ; Zarate, A. ; Martin, J.I. ; Perez, J.M.
Author_Institution
Comput. Archit. & Technol. Dept., Basque Country Univ., Donostia, Spain
Volume
2
fYear
1998
fDate
16-20 Aug 1998
Firstpage
1106
Abstract
This work addresses the segmentation of numeric fields in forms whose digits are blurred, broken or touching. In an OCR system, the segmentation phase plays a decisive role in the overall accuracy of the system. Segmentation is generally approached in one of two ways: (a) as an isolated phase in the OCR process, or (b) as a phase that interacts with the recognition of the segmented item. In this work, we have taken the first approach in order to develop a robust new cost function combining vertical projection, the Tsujimoto metric (1991) and background information. Unlike other techniques reported in the literature, ours obtains a near-optimum number of break points in fields containing broken, blurred and touching characters, leading to high accuracy in the overall OCR system. Our experiments with a sample of 11283 numeric fields in 144 forms (more than 50000 digits of that kind) show that 99.74% of fields were correctly segmented. The new cost function only made 50 errors
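The abstract does not give the form of the cost function, so the following is only a minimal sketch of the general idea of scoring columns as candidate break points. It combines a plain vertical projection with an AND-of-adjacent-columns term, which is one common reading of the Tsujimoto metric and is an assumption here; the background-information term is omitted. The function names, the weights `w_proj` and `w_and`, and the threshold `max_cost` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def vertical_projection(img):
    """Count ink pixels in each column of a binary image (True = ink)."""
    return img.sum(axis=0)

def and_column_cost(img):
    """Ink pixels left in each column after ANDing it with its right neighbour.

    Assumed stand-in for a Tsujimoto-style break cost; the last column has no
    right neighbour and is given cost 0.
    """
    cost = np.zeros(img.shape[1], dtype=int)
    cost[:-1] = (img[:, :-1] & img[:, 1:]).sum(axis=0)
    return cost

def candidate_breaks(img, w_proj=1.0, w_and=1.0, max_cost=0.0):
    """Return one candidate break column per run of low-cost columns."""
    img = np.asarray(img, dtype=bool)
    # Hypothetical weighted sum; the paper's actual combination is not known.
    cost = w_proj * vertical_projection(img) + w_and * and_column_cost(img)
    cols = np.flatnonzero(cost <= max_cost)
    if cols.size == 0:
        return []
    # Split the low-cost columns into runs of consecutive indices and keep
    # the middle column of each run as the break point.
    runs = np.split(cols, np.flatnonzero(np.diff(cols) > 1) + 1)
    return [int(run[len(run) // 2]) for run in runs]

# Toy field: two ink blobs (columns 1-2 and 6-7) separated by empty columns.
field = np.zeros((5, 9), dtype=bool)
field[1:4, 1:3] = True
field[1:4, 6:8] = True
print(candidate_breaks(field))
```

On this toy field the sketch reports the left margin, the middle of the gap between the two blobs, and the right margin. Raising `max_cost` above zero would also allow cuts through thin ink bridges, which is where touching digits become relevant.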
Keywords
image segmentation; optical character recognition; OCR system; Tsujimoto metric; background information; blurring; low-quality typewritten digit segmentation; vertical projection; Character recognition; Computer architecture; Cost function; Electronic mail; Feature extraction; Image quality; Image segmentation; Optical character recognition software; Read only memory; Text recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition, 1998. Proceedings. Fourteenth International Conference on
Conference_Location
Brisbane, Qld.
ISSN
1051-4651
Print_ISBN
0-8186-8512-3
Type
conf
DOI
10.1109/ICPR.1998.711887
Filename
711887