Title :
Identification of text on colored book and journal covers
Author :
Sobottka, Karin ; Bunke, Horst ; Kronenberg, Heino
Author_Institution :
Inst. für Inf., Bern Univ., Switzerland
Abstract :
An approach to automatic text location and identification on colored book and journal covers is proposed. To reduce small variations in color, a clustering algorithm is applied in a preprocessing step. Two methods have been developed for extracting text hypotheses. One is based on a top-down analysis using successive splitting of image regions. The other is a bottom-up region growing algorithm. The results of both methods are combined to robustly distinguish between text and non-text elements. Text elements are binarized using automatically extracted information about text color. The binarized text regions can be used as input for a conventional OCR module. Results are shown for parts of book and journal covers of different complexity. The proposed method is not restricted to cover pages, but can be applied to the extraction of text from other types of color images as well.
Keywords :
feature extraction; image colour analysis; image segmentation; optical character recognition; text analysis; automatic text location; automatically extracted information; binarized text regions; bottom-up region growing algorithm; clustering algorithm; color images; colored book; conventional OCR module; cover pages; image regions; journal covers; preprocessing step; successive splitting; text color; text elements; text extraction; text hypotheses; text identification; top-down analysis; Books; Character recognition; Clustering algorithms; Color; Electrical capacitance tomography; Image analysis; Informatics; Mathematics; Optical character recognition software; Shape;
Conference_Title :
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location :
Bangalore
Print_ISBN :
0-7695-0318-7
DOI :
10.1109/ICDAR.1999.791724