DocumentCode :
2502649
Title :
Binarization of Color Characters in Scene Images Using k-means Clustering and Support Vector Machines
Author :
Kita, Kohei ; Wakahara, Toru
Author_Institution :
Fac. of Comput. & Inf. Sci., Hosei Univ., Koganei, Japan
fYear :
2010
fDate :
23-26 Aug. 2010
Firstpage :
3183
Lastpage :
3186
Abstract :
This paper proposes a new technique for binarizing multicolored characters subject to heavy degradations. The key ideas are threefold. The first is the generation of tentatively binarized images via every dichotomization of k clusters obtained by k-means clustering in the HSI color space. The total number of tentatively binarized images equals 2^k - 2. The second is the use of support vector machines (SVM) to determine whether and to what degree each tentatively binarized image represents a character or non-character. We feed the SVM with mesh and weighted direction code histogram features to output the degree of “character-likeness.” The third is the selection of the single binarized image with the maximum degree of “character-likeness” as the optimal binarization result. Experiments using a total of 1000 single-character color images extracted from the ICDAR 2003 robust OCR dataset show that the proposed method achieves a correct binarization rate of 93.7%.
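The enumeration step in the abstract can be illustrated with a minimal sketch. It reads the count of tentative binarizations as 2^k - 2, that is, every non-empty proper subset of the k clusters taken as foreground. This is not the authors' implementation: the function names, the list-of-lists label map, and the toy scoring function are assumptions for illustration, and the k-means clustering in HSI space and the SVM "character-likeness" score are replaced by stand-ins.

```python
from itertools import combinations


def dichotomizations(k):
    """Yield every non-empty proper subset of cluster ids 0..k-1.

    Each subset is treated as the foreground; the remaining clusters are
    background. There are 2**k - 2 such subsets.
    """
    for size in range(1, k):
        for subset in combinations(range(k), size):
            yield frozenset(subset)


def tentative_binarizations(label_map, k):
    """Build one binary image per dichotomization.

    label_map is a 2-D list of cluster ids (assumed to come from a
    k-means step that is not shown here).
    """
    return [
        [[1 if label in foreground else 0 for label in row] for row in label_map]
        for foreground in dichotomizations(k)
    ]


def select_best(images, score):
    """Return the image with the highest score.

    score is a hypothetical stand-in for the SVM character-likeness output.
    """
    return max(images, key=score)


if __name__ == "__main__":
    labels = [[0, 1], [2, 1]]          # toy 2x2 label map with k = 3 clusters
    candidates = tentative_binarizations(labels, 3)
    print(len(candidates))             # number of tentative binarizations
    # Toy score: count of foreground pixels (NOT the paper's SVM score).
    best = select_best(candidates, lambda img: sum(map(sum, img)))
    print(best)
```

Because the candidate count grows exponentially in k, a scheme like this is only practical for a small number of clusters.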
Keywords :
document image processing; image colour analysis; pattern clustering; support vector machines; text analysis; color characters binarization; k-means clustering; multicolored characters; scene images; Character recognition; Feature extraction; Histograms; Image color analysis; Pixel; Training data; binarization of multicolored characters; figure-ground discrimination
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition (ICPR), 2010 20th International Conference on
Conference_Location :
Istanbul
ISSN :
1051-4651
Print_ISBN :
978-1-4244-7542-1
Type :
conf
DOI :
10.1109/ICPR.2010.779
Filename :
5597180