Title : 
Locating text based on connected component and SVM
         
        
            Author : 
Yao, Jin-liang ; Wang, Yan-qing ; Weng, Lu-bin ; Yang, Yi-Ping
         
        
            Author_Institution : 
Chinese Acad. of Sci., Beijing
         
        
        
        
        
        
        
            Abstract : 
This paper presents a novel connected-component-based method for locating text in complex backgrounds using a support vector machine (SVM). The method consists of two stages. In the first stage, a cascade of threshold classifiers and an SVM are used to identify characters. In the second stage, the identified characters are combined into text candidates, and text features are then extracted and used to identify text regions. Two kinds of features, character features and text features, are used to locate text regions. Character features discriminate character connected components (CCs) from other objects in a complex background. Text features capture the property that characters in the same text share the same size, color, and font. The cascade of threshold classifiers discards most non-character objects and improves the efficiency of character feature extraction. The SVM is used to identify characters that the cascade of threshold classifiers cannot identify. Experimental results demonstrate that the proposed approach is robust to different character sizes, colors, and languages, and achieves high precision as measured on the ICDAR 2003 test database.
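
The first stage described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Component` fields, the features used in `CASCADE`, every threshold value, and the `toy_classifier` stand-in are assumptions made here for the example. In the paper the final decision is made by a trained SVM, which this sketch replaces with an arbitrary callable.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Component:
    """Summary of one connected component (hypothetical feature set).

    Width and height are assumed to be positive.
    """
    width: int
    height: int
    pixel_count: int  # foreground pixels inside the bounding box

    @property
    def aspect_ratio(self) -> float:
        return self.width / self.height

    @property
    def fill_ratio(self) -> float:
        return self.pixel_count / (self.width * self.height)


# One cascade stage: (feature function, lower bound, upper bound).
Stage = Tuple[Callable[[Component], float], float, float]

# Illustrative thresholds only; the abstract does not give the real ones.
CASCADE: List[Stage] = [
    (lambda c: c.height, 8, 300),
    (lambda c: c.aspect_ratio, 0.1, 3.0),
    (lambda c: c.fill_ratio, 0.1, 0.95),
]


def passes_cascade(c: Component, cascade: Sequence[Stage] = CASCADE) -> bool:
    """Return True if the component survives every threshold stage.

    all() short-circuits, so a component rejected by an early stage
    never has the later features computed.
    """
    return all(low <= feature(c) <= high for feature, low, high in cascade)


def identify_characters(
    components: Sequence[Component],
    classifier: Callable[[Component], bool],
    cascade: Sequence[Stage] = CASCADE,
) -> List[Component]:
    """Discard obvious non-characters with the cascade, then let the
    classifier (an SVM in the paper) decide on the survivors."""
    survivors = [c for c in components if passes_cascade(c, cascade)]
    return [c for c in survivors if classifier(c)]


def toy_classifier(c: Component) -> bool:
    """Placeholder for a trained SVM; NOT an SVM, just a fixed rule."""
    return c.fill_ratio > 0.2
```

Ordering the cascade from the cheapest feature to the most expensive one is what would let it improve the efficiency of feature extraction, since most non-character components are rejected before the costlier features are computed.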
         
        
            Keywords : 
character recognition; feature extraction; support vector machines; text analysis; visual databases; ICDAR 2003 test database; SVM; character connected components; character feature extraction; character identification; support vector machine; text feature extraction; text location; threshold classifiers; Carbon capture and storage; Image segmentation; Optical character recognition software; Robustness; Support vector machine classification; Text recognition; Wavelet analysis; cascade of classifiers; text features
         
        
        
        
            Conference_Title : 
Wavelet Analysis and Pattern Recognition, 2007. ICWAPR '07. International Conference on
         
        
            Conference_Location : 
Beijing
         
        
            Print_ISBN : 
978-1-4244-1065-1
         
        
            Electronic_ISBN : 
978-1-4244-1066-8
         
        
        
            DOI : 
10.1109/ICWAPR.2007.4421657