Title :
OCR-based rate-distortion analysis of residual coding
Author :
Kia, Omid E. ; Doermann, David S.
Author_Institution :
Lab. of Inf. Technol., Nat. Inst. of Stand. & Technol., Gaithersburg, MD, USA
Abstract :
Symbolic compression of document images provides access to the symbols found in those images and exploits the redundancy among them. Document images are highly structured and contain large numbers of repetitive symbols. We have shown that symbolically compressing a document image also makes compressed-domain processing possible. Symbolic compression forms representative prototypes for symbols and encodes the image by the locations of these prototypes and a residual (the difference between each symbol and its prototype). We analyze the rate-distortion tradeoff by varying the amount of residual used in compression for both distance- and row-order coding. Distortion is measured by the performance of an OCR system on the resulting image. The University of Washington document database images, ground truth, and OCR evaluation software are used for the experiments.
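The prototype-plus-residual idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the residual is modeled as a pixel-wise XOR of binary bitmaps, "row-order" is read as raster order of the residual pixels, and distortion is a plain pixel-mismatch count standing in for the OCR-based measure. Distance-order coding is not shown because the abstract does not define the ordering.

```python
# Hypothetical sketch: binary bitmaps are lists of rows of 0/1 integers.

def residual(symbol, prototype):
    """Pixel-wise XOR between a symbol bitmap and its prototype (assumed model)."""
    return [[s ^ p for s, p in zip(srow, prow)]
            for srow, prow in zip(symbol, prototype)]

def row_order_positions(res):
    """Coordinates of set residual pixels in raster (row-by-row) order."""
    return [(r, c) for r, row in enumerate(res)
            for c, v in enumerate(row) if v]

def reconstruct(prototype, res, fraction):
    """Rebuild a symbol from the prototype plus the leading fraction of residual pixels."""
    positions = row_order_positions(res)
    keep = positions[:int(fraction * len(positions))]
    out = [row[:] for row in prototype]  # copy so the prototype is not modified
    for r, c in keep:
        out[r][c] ^= 1  # flip the pixel where symbol and prototype differ
    return out

def pixel_distortion(a, b):
    """Number of differing pixels; a stand-in for the OCR-based distortion measure."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

prototype = [[0, 1, 0],
             [0, 1, 0],
             [0, 1, 0]]
symbol = [[1, 1, 0],
          [0, 1, 0],
          [0, 1, 1]]

res = residual(symbol, prototype)
# Sending more of the residual raises the rate and lowers the distortion.
curve = [(f, pixel_distortion(symbol, reconstruct(prototype, res, f)))
         for f in (0.0, 0.5, 1.0)]
```

Under these assumptions, `curve` pairs each residual fraction with the remaining pixel error, which is the shape of tradeoff the abstract describes; the paper itself measures distortion by OCR performance, not pixel counts.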
Keywords :
data compression; document image processing; image coding; image representation; optical character recognition; rate distortion theory; OCR evaluation software; OCR system performance; University of Washington; compressed-domain processing; distance-order coding; distortion measure; document database images; document images; experiments; ground truth; lossy compression; lossy representation; progressive transmission; rate-distortion analysis; redundancy; representative prototypes; residual coding; row-order coding; symbolic compression; Distortion measurement; Image coding; Image recognition; Laboratories; Optical character recognition software; Performance analysis; Pixel; Propagation losses; Prototypes; Rate-distortion
Conference_Title :
Image Processing, 1997. Proceedings., International Conference on
Conference_Location :
Santa Barbara, CA
Print_ISBN :
0-8186-8183-7
DOI :
10.1109/ICIP.1997.632215