Title :
A fuzzy approach for word level script identification of text in low resolution display board images using wavelet features
Author :
Angadi, S.A. ; Kodabagi, M.M.
Author_Institution :
Dept. of Comput. Sci. & Eng., Basaveshwar Eng. Coll., Bagalkot, India
Abstract :
Automated systems for understanding low resolution images of display boards are enabling several new applications such as blind assistants, tour guide systems, location aware systems and many more. Script identification at the character/word level is an important pre-processing step in developing such systems, prior to further image analysis. In this paper, a new fuzzy based approach for word level script identification of text in low resolution images of display boards is presented. The proposed methodology uses horizontal run statistics and wavelet features to distinguish 5 Indian scripts, namely Hindi, Kannada, English, Malayalam and Tamil. The method works in two phases. In the first phase, wavelet transform based texture features, namely zone wise wavelet energy features, vertical run statistical features of wavelet coefficients and wavelet log mean deviation features of decomposed energy bands at 2 levels, are obtained from training word images, and crisp sets are constructed, one for each script/language under study. The second phase is testing, in which a test word image is first processed to obtain horizontal run statistics that determine whether it belongs to the Hindi script. If it does not, the word image is processed to obtain a crisp vector. The degree of belongingness of the crisp vector to each candidate object in the crisp sets is determined using a newly devised fuzzy membership function. A fuzzy inference scheme is then used to identify the script of the test word image. The proposed method is robust and insensitive to variations in font size and style, number of characters, thickness and spacing between characters, noise, and other degradations. The proposed method achieves an overall script identification accuracy of 94.33% and individual identification accuracies of 100% for Hindi script, 98.67% for Kannada script, 100% for English, 89% for Malayalam and 84% for Tamil script.
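The feature extraction and fuzzy matching steps outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the wavelet family, the zoning scheme, or the form of the membership function, so everything below is an assumption made for illustration: a Haar wavelet, whole-band mean energies in place of zone wise energies, and a Gaussian-style membership with a max-then-argmax decision rule. The names `haar_dwt2`, `wavelet_energy_features`, `membership` and `identify_script` are hypothetical and are not taken from the paper.

```python
import numpy as np


def haar_dwt2(img):
    """One level of a 2-D orthonormal Haar transform.

    Returns the approximation band and three detail bands (LL, LH, HL, HH).
    The Haar wavelet is an assumption; the abstract does not name a wavelet.
    """
    img = np.asarray(img, dtype=float)
    # Trim to even dimensions so that rows and columns pair up.
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    # Pairwise sums and differences of adjacent columns.
    lo = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2)
    hi = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2)
    # Pairwise sums and differences of adjacent rows.
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return ll, lh, hl, hh


def wavelet_energy_features(img, levels=2):
    """Mean energy of each detail band at every level, plus the final LL band.

    Simplification: energies are taken over whole bands, not over zones.
    """
    feats = []
    ll = np.asarray(img, dtype=float)
    for _ in range(levels):
        ll, lh, hl, hh = haar_dwt2(ll)
        feats.extend(float(np.mean(band ** 2)) for band in (lh, hl, hh))
    feats.append(float(np.mean(ll ** 2)))
    return np.array(feats)


def membership(x, prototype, spread=1.0):
    """Hypothetical Gaussian-style membership in (0, 1].

    This stands in for the paper's newly devised function, which the
    abstract does not specify.
    """
    d2 = float(np.sum((np.asarray(x, dtype=float)
                       - np.asarray(prototype, dtype=float)) ** 2))
    return float(np.exp(-d2 / (2.0 * spread ** 2)))


def identify_script(x, script_sets, spread=1.0):
    """Pick the script whose stored vectors give the highest membership.

    script_sets maps a script name to a list of training feature vectors.
    The max-then-argmax rule is an assumed inference scheme.
    """
    scores = {
        script: max(membership(x, p, spread) for p in vectors)
        for script, vectors in script_sets.items()
    }
    return max(scores, key=scores.get), scores
```

With 2 decomposition levels the sketch yields a 7-element vector per word image (three detail-band energies per level plus one approximation energy). In the described method this matching would apply only to word images not already identified as Hindi by the horizontal run statistics.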
Keywords :
display devices; document image processing; fuzzy reasoning; image texture; text analysis; wavelet transforms; English script; Hindi script; Indian scripts; Kannada script; Malayalam script; Tamil script; blind assistants; decomposed energy bands; fuzzy approach; fuzzy inference scheme; fuzzy membership function; horizontal run statistics; image analysis; location aware systems; low resolution display board images; test word image; tour guide systems; vertical run statistical features; wavelet coefficients; wavelet log mean deviation features; wavelet transform based texture features; word level text script identification; zone wise wavelet energy features; Image resolution; Systematics; Fuzzy Approach; Low Resolution Display Board Images; Script Identification; Wavelet Features;
Conference_Title :
Advances in Computing, Communications and Informatics (ICACCI), 2013 International Conference on
Conference_Location :
Mysore
Print_ISBN :
978-1-4799-2432-5
DOI :
10.1109/ICACCI.2013.6637455