DocumentCode
3487845
Title
Sparse Document Image Coding for Restoration
Author
Kumar, Vipin ; Bansal, Ankur ; Tulsiyan, Goutam Hari ; Mishra, Anadi ; Namboodiri, Anoop ; Jawahar, C.V.
Author_Institution
Center for Visual Inf. Technol., IIIT Hyderabad, Hyderabad, India
fYear
2013
fDate
25-28 Aug. 2013
Firstpage
713
Lastpage
717
Abstract
Sparse-representation-based image restoration techniques have been shown to be successful in solving various inverse problems, such as denoising, inpainting, and super-resolution, on natural images and videos. In this paper, we explore the use of sparse-representation-based methods specifically to restore degraded document images. While natural images form a very small subset of all possible images, which admits the possibility of sparse representation, document images are significantly more restricted and are expected to be ideally suited to such a representation. However, the binary nature of textual document images makes dictionary learning and coding techniques unsuitable for direct application. We leverage the fact that different characters possess similar strokes, curves, and edges, and learn a dictionary that gives a sparse decomposition of patches. Experimental results show significant improvement in image quality and OCR performance on documents collected from a variety of sources such as magazines and books. This method is therefore ideally suited to restoring highly degraded images in repositories such as digital libraries.
Keywords
document image processing; image coding; image representation; image restoration; learning (artificial intelligence); text analysis; OCR performance; degraded document image restoration; dictionary learning; image quality; natural images; sparse decomposition; sparse document image coding; sparse representation based image restoration techniques; textual document images; Degradation; Dictionaries; Image coding; Image restoration; Noise; Noise measurement; Optical character recognition software; Dictionary learning; Document restoration; Sparse representation;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location
Washington, DC
ISSN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2013.146
Filename
6628711