Title :
Multiple Training - One Test Methodology for Handwritten Word-Script Identification
Author :
Ferrer, Miguel A. ; Morales, Aythami ; Rodriguez, N. ; Pal, Umapada
Author_Institution :
Innovacion en Comun., Univ. de Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain
Abstract :
Script identification is an important area of handwritten document image analysis. Word-level script identification in documents written in multiple scripts is an open challenge for the scientific community and a real concern in countries with multiple official languages, e.g. India. Such documents usually contain two scripts: most of the document is written in the regional script, while some words, acronyms or numbers are written in Roman script. In this case, word-level or even character-level script identification is required to locate the second-script characters in the document. The major problem here is that few script descriptors are available for script estimation, which leads to high error rates. The literature tries to address this problem by looking for more efficient descriptors. In this paper we propose a Multiple Training - One Test technique to alleviate this problem. Several classifiers are trained, each one with words carrying a similar amount of information. A scale-invariant word information index is defined for this purpose. To identify the script of a query word, its word information index is computed, and its script is identified with the most appropriate classifier. Accuracy improvements have been obtained with this promising technique, especially for the shorter words.
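The routing idea in the abstract (one classifier per group of words with a similar amount of information, and a query word sent to the classifier of its own group) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the class names, the bin edges, the fallback to the nearest trained bin, and the nearest-centroid classifier, which is only a dependency-free stand-in (the keywords of this record mention support vector machines). The abstract does not define the word information index, so the sketch takes it as a precomputed number per word.

```python
from bisect import bisect_right
from collections import defaultdict


def centroid(vectors):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Stand-in classifier (an assumption, not the classifier of the paper)."""

    def fit(self, features, labels):
        grouped = defaultdict(list)
        for feature, label in zip(features, labels):
            grouped[label].append(feature)
        self.centroids = {label: centroid(fs) for label, fs in grouped.items()}
        return self

    def predict(self, feature):
        return min(self.centroids,
                   key=lambda label: squared_distance(feature, self.centroids[label]))


class MultipleTrainingOneTest:
    """Hypothetical wrapper: one classifier per information-index bin."""

    def __init__(self, bin_edges):
        # bin_edges are assumed thresholds on the word information index.
        self.bin_edges = sorted(bin_edges)
        self.models = {}

    def _bin(self, info_index):
        return bisect_right(self.bin_edges, info_index)

    def fit(self, samples):
        # samples: iterable of (info_index, feature_vector, script_label).
        per_bin = defaultdict(list)
        for info_index, feature, label in samples:
            per_bin[self._bin(info_index)].append((feature, label))
        for bin_id, items in per_bin.items():
            features, labels = zip(*items)
            self.models[bin_id] = NearestCentroid().fit(features, labels)
        return self

    def predict(self, info_index, feature):
        bin_id = self._bin(info_index)
        if bin_id not in self.models:
            # Assumed fallback: use the closest bin that has a trained model.
            bin_id = min(self.models, key=lambda k: abs(k - bin_id))
        return self.models[bin_id].predict(feature)
```

In use, `MultipleTrainingOneTest([1.0]).fit(samples)` would train one model for words whose index is below 1.0 and another for the rest, and `predict(info_index, feature)` would consult only the model matching the query word's index. The point of the design is that short and long words are never compared against the same decision boundary.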
Keywords :
document image processing; handwritten character recognition; support vector machines; vocabulary; Roman script; character level script identification; handwriting document image analysis field; handwritten word-script identification; multiple official languages; multiple training; regional script estimation; scientific community; texture descriptors; Accuracy; Feature extraction; Histograms; Indexes; Testing; Training; Document Analysis; Handwritten Script Identification; Multiple training; Texture descriptors;
Conference_Title :
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location :
Heraklion
Print_ISBN :
978-1-4799-4335-7
DOI :
10.1109/ICFHR.2014.132