DocumentCode
83037
Title
Robust Document Image Binarization Technique for Degraded Document Images
Author
Bolan Su ; Shijian Lu ; Chew Lim Tan
Author_Institution
Sch. of Comput., Nat. Univ. of Singapore, Singapore, Singapore
Volume
22
Issue
4
fYear
2013
fDate
April 2013
Firstpage
1408
Lastpage
1417
Abstract
Segmentation of text from badly degraded document images is a very challenging task due to the high inter/intra-variation between the document background and the foreground text of different document images. In this paper, we propose a novel document image binarization technique that addresses these issues by using adaptive image contrast. The adaptive image contrast is a combination of the local image contrast and the local image gradient that is tolerant to text and background variation caused by different types of document degradations. In the proposed technique, an adaptive contrast map is first constructed for an input degraded document image. The contrast map is then binarized and combined with Canny's edge map to identify the text stroke edge pixels. The document text is further segmented by a local threshold that is estimated based on the intensities of detected text stroke edge pixels within a local window. The proposed method is simple, robust, and involves minimal parameter tuning. It has been tested on three public datasets used in the recent Document Image Binarization Contests (DIBCO) 2009 and 2011 and Handwritten-DIBCO 2010, and achieves accuracies of 93.5%, 87.8%, and 92.03%, respectively, which are significantly higher than or close to those of the best-performing methods reported in the three contests. Experiments on the Bickley diary dataset, which consists of several challenging poor-quality document images, also show the superior performance of our proposed method compared with other techniques.
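The pipeline outlined in the abstract can be illustrated with a minimal, simplified sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the blending weight `alpha`, the scaling of the gradient by 255, the fixed `edge_thresh` (standing in for the binarization of the contrast map), and the `min_edges` rule are all hypothetical, and the Canny edge-map step is omitted entirely. It is not the authors' implementation. Plain Python lists are used so the sketch runs without extra libraries.

```python
def window(img, r, c, radius):
    """Values in the square window around (r, c), clipped at the image border."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))]


def adaptive_contrast(img, alpha=0.5, radius=1, eps=1e-6):
    """Blend the normalised local contrast with the local gradient (max - min).

    alpha and the /255 scaling are illustrative assumptions.
    """
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            w = window(img, r, c, radius)
            hi, lo = max(w), min(w)
            contrast = (hi - lo) / (hi + lo + eps)  # local image contrast
            gradient = (hi - lo) / 255.0            # local image gradient, scaled
            row.append(alpha * contrast + (1 - alpha) * gradient)
        out.append(row)
    return out


def binarize(img, alpha=0.5, edge_thresh=0.3, radius=2, min_edges=2):
    """Return a 0/1 map where 1 marks pixels classified as text.

    Step 1: build the adaptive contrast map.
    Step 2: mark high-contrast pixels as candidate stroke-edge pixels
            (a fixed threshold here; the Canny combination is omitted).
    Step 3: threshold each pixel against the mean intensity of the
            edge pixels found in its local window.
    """
    contrast = adaptive_contrast(img, alpha)
    edges = [[v >= edge_thresh for v in row] for row in contrast]
    result = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            pixels = window(img, r, c, radius)
            flags = window(edges, r, c, radius)
            edge_vals = [p for p, is_edge in zip(pixels, flags) if is_edge]
            if len(edge_vals) >= min_edges:
                local_mean = sum(edge_vals) / len(edge_vals)
                row.append(1 if img[r][c] < local_mean else 0)
            else:
                row.append(0)  # too few edge pixels nearby: treat as background
        result.append(row)
    return result
```

Calling `binarize` on a small grayscale image given as a list of rows (values 0 to 255, dark text on a light background) returns a map of the same size. Because the threshold is derived only from nearby stroke-edge pixels, it follows local background changes, which is the property the abstract attributes to the method.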
Keywords
document image processing; image segmentation; Bickley diary dataset; Canny edge map; DIBCO; adaptive contrast map; adaptive image contrast; background variation; degraded document image segmentation; document image binarization technique; high inter-intravariation; local image contrast; local image gradient; local threshold; local window; minimum parameter tuning; text stroke edge pixels; text variation; Degradation; Equations; Histograms; Image edge detection; Image segmentation; Mathematical model; Robustness; Adaptive image contrast; degraded document image binarization; document analysis; document image processing; pixel classification
fLanguage
English
Journal_Title
Image Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1057-7149
Type
jour
DOI
10.1109/TIP.2012.2231089
Filename
6373726