DocumentCode :
3168022
Title :
Using OCR and equalization to downsample documents
Author :
Agazzi, O.E. ; Church, K.W. ; Gale, W.A.
Author_Institution :
AT&T Bell Labs., Murray Hill, NJ, USA
Volume :
2
fYear :
1994
fDate :
9-13 Oct 1994
Firstpage :
305
Abstract :
Documents need to be sampled at different rates for different output devices: 300-600 dpi for laser printers, 100-200 dpi for fax, and 75-100 dpi for bitmap terminals. To output a high resolution document on a low resolution device, it may be necessary to introduce downsampling. Standard signal processing techniques such as linear filtering and decimation don't work very well at low resolutions. Better results are obtained by a nonlinear filtering technique we introduce in this paper, called nonlinear document equalization. Even better results are obtained by taking advantage of fonts designed specifically for bitmap terminals and other low resolution devices. However, character-level information is required to make use of fonts. This information is not always available; OCR is not 100% accurate. We propose a hybrid approach: downsample by font substitution when possible, and decimate when necessary. Unfortunately, the result tends to look like a “ransom note”. Equalization is used to blend the two cases together so that gaps in the OCR analysis become almost unnoticeable.
Keywords :
document image processing; OCR; bitmap terminals; character-level information; decimation; document downsampling; document sampling; font substitution; high-resolution document; nonlinear document equalization; nonlinear filtering technique; Art; Filtering; Gray-scale; Low pass filters; Maximum likelihood detection; Nonlinear filters; Optical character recognition software; Signal processing; Signal processing algorithms; Signal resolution;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Proceedings of the 12th IAPR International Conference on Pattern Recognition, 1994. Vol. 2 - Conference B: Computer Vision & Image Processing
Conference_Location :
Jerusalem
Print_ISBN :
0-8186-6270-0
Type :
conf
DOI :
10.1109/ICPR.1994.576925
Filename :
576925