DocumentCode :
1281148
Title :
Binarization of color document images via luminance and saturation color features
Author :
Tsai, Chun-Ming ; Lee, Hsi-Jian
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Volume :
11
Issue :
4
fYear :
2002
fDate :
4/1/2002
Firstpage :
434
Lastpage :
451
Abstract :
This paper presents a novel binarization algorithm for color document images. Conventional thresholding methods do not produce satisfactory binarization results for documents whose foreground and background colors are close or mixed. Initially, statistical image features are extracted from the luminance distribution. Then, a decision-tree-based binarization method is proposed, which selects among color features to binarize color document images. First, if the document image colors are concentrated within a limited range, saturation is employed. Second, if the image foreground colors are significant, luminance is adopted. Third, if the image background colors are concentrated within a limited range, luminance is also applied. Fourth, if the total number of pixels with low luminance (less than 60) is limited, saturation is applied; otherwise, both luminance and saturation are employed. Our experiments include 519 color images, most of which are uniform invoice and name-card document images. The proposed binarization method generates better results than other available methods in shape and connected-component measurements. The binarization method also obtains higher recognition accuracy in a commercial OCR system than other comparable methods.
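The four rules in the abstract can be read as an ordered cascade, and the sketch below shows one way that reading might look in code. It is not the authors' implementation. The abstract does not say how "concentrated", "significant", or "limited" are measured, so the boolean fields of `LuminanceStats` and the `low_fraction_limit` parameter are hypothetical placeholders. Only the luminance cutoff of 60 comes from the abstract, and evaluating the rules strictly in the stated order is also an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# From the abstract: "low luminance (less than 60)".
LOW_LUMINANCE_CUTOFF = 60


@dataclass
class LuminanceStats:
    """Hypothetical summary of the luminance distribution.

    The abstract does not define how these conditions are computed,
    so they are modeled here as precomputed flags.
    """
    colors_concentrated: bool      # image colors fall within a limited range
    foreground_significant: bool   # foreground colors are significant
    background_concentrated: bool  # background colors fall within a limited range
    low_luminance_fraction: float  # share of pixels with luminance < 60


def low_luminance_fraction(luminances: Iterable[int]) -> float:
    """Return the fraction of pixels whose luminance is below the cutoff."""
    values = list(luminances)
    if not values:
        return 0.0
    low = sum(1 for v in values if v < LOW_LUMINANCE_CUTOFF)
    return low / len(values)


def select_features(stats: LuminanceStats,
                    low_fraction_limit: float = 0.05) -> Tuple[str, ...]:
    """Pick the color feature(s) to threshold on, following the four rules.

    low_fraction_limit is an assumed value, not one given in the abstract.
    """
    if stats.colors_concentrated:        # rule 1
        return ("saturation",)
    if stats.foreground_significant:     # rule 2
        return ("luminance",)
    if stats.background_concentrated:    # rule 3
        return ("luminance",)
    if stats.low_luminance_fraction < low_fraction_limit:  # rule 4
        return ("saturation",)
    return ("luminance", "saturation")   # fallback: use both features
```

For example, `select_features(LuminanceStats(False, False, False, 0.5))` falls through all four tests and returns both features, which corresponds to the "else" branch of the fourth rule.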
Keywords :
brightness; document image processing; feature extraction; image colour analysis; image recognition; statistical analysis; binarization algorithm; color document image binarization; color features selection; commercial OCR system; connected-component measurements; decision-tree based binarization method; image background colors; image foreground colors; image recognition accuracy; luminance; luminance color features; luminance distribution; name-card document images; saturation; saturation color features; shape measurements; statistical image features extraction; thresholding methods; uniform invoice images; Character recognition; Feature extraction; Histograms; Image color analysis; Image segmentation; Image storage; Optical character recognition software; Pattern recognition; Shape measurement; Text analysis;
fLanguage :
English
Journal_Title :
Image Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1057-7149
Type :
jour
DOI :
10.1109/TIP.2002.999677
Filename :
999677