Abstract:
This paper presents a new method for detecting text regions in natural scene images. The proposed algorithm first segments the objects in a scene and then identifies the text objects among them with a support vector machine (SVM). In the segmentation step, the input image is separated into chromatic and achromatic regions according to the distribution of its red, green and blue (RGB) components, and a different clustering algorithm is applied to each type of region. In the classification step, each object is transformed into the wavelet domain for multi-resolution analysis, and moment features of the wavelet coefficients are passed to the SVM to classify the object as text or non-text. Because the clustering algorithm is chosen according to the characteristics of the colour components, the segmentation is robust to non-uniform illumination. The moment features used for classification are invariant to the size, direction, shape and other properties of the text. Experimental results demonstrate the effectiveness of this approach.
Keywords:
image colour analysis; image resolution; image segmentation; support vector machines; text analysis; multi-resolution analysis; natural scene images; object segmentation; support vector machine; text region detection; wavelet transforms; Clustering algorithms; Layout; Lighting; Robustness; Support vector machine classification; Wavelet analysis; Wavelet coefficients; Wavelet domain; feature extraction; natural scene image; segmentation; text detection
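The two stages summarised in the abstract can be illustrated with a minimal sketch. The abstract does not give the chromatic/achromatic criterion, the wavelet family, or the exact moment definitions, so everything below is an assumption made for illustration: a pixel is treated as chromatic when the spread of its R, G and B values exceeds a hypothetical threshold, a single-level Haar transform stands in for the wavelet decomposition, and the features are the mean and the second and third central moments of each sub-band. The function names are hypothetical, and the clustering and SVM steps are omitted.

```python
def is_chromatic(pixel, threshold=30):
    """Assumed criterion: a pixel is chromatic when its R, G, B values differ
    by more than `threshold`. The paper's actual rule is not given here."""
    return max(pixel) - min(pixel) > threshold


def split_chromatic(image, threshold=30):
    """Return a boolean mask (True = chromatic) for a 2-D grid of RGB tuples."""
    return [[is_chromatic(p, threshold) for p in row] for row in image]


def haar_level(gray):
    """One level of a 2-D orthonormal Haar transform.

    `gray` is a grid of numbers with even height and width. Returns four
    sub-bands: approximation, column difference, row difference, diagonal.
    """
    approx, col_diff, row_diff, diag = [], [], [], []
    for i in range(0, len(gray), 2):
        out = ([], [], [], [])
        for j in range(0, len(gray[0]), 2):
            a, b = gray[i][j], gray[i][j + 1]
            c, d = gray[i + 1][j], gray[i + 1][j + 1]
            out[0].append((a + b + c + d) / 2)
            out[1].append((a - b + c - d) / 2)
            out[2].append((a + b - c - d) / 2)
            out[3].append((a - b - c + d) / 2)
        approx.append(out[0])
        col_diff.append(out[1])
        row_diff.append(out[2])
        diag.append(out[3])
    return approx, col_diff, row_diff, diag


def moments(band):
    """Mean, second central moment and third central moment of a sub-band."""
    values = [v for row in band for v in row]
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return [m, m2, m3]


def wavelet_moment_features(gray):
    """Concatenate the moments of all four sub-bands (12 values)."""
    features = []
    for band in haar_level(gray):
        features.extend(moments(band))
    return features
```

In a full pipeline, one such feature vector per segmented object would be passed to a trained SVM, which would label the object as text or non-text.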