DocumentCode :
183457
Title :
Multiple Training - One Test Methodology for Handwritten Word-Script Identification
Author :
Ferrer, Miguel A. ; Morales, Aythami ; Rodriguez, N. ; Pal, Umapada
Author_Institution :
Innovacion en Comun., Univ. de Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain
fYear :
2014
fDate :
1-4 Sept. 2014
Firstpage :
754
Lastpage :
759
Abstract :
Script identification is an important area of handwritten document image analysis. Word-level script identification in documents written in multiple scripts is an open challenge for the scientific community and a real concern in countries with multiple official languages, e.g., India. Such documents usually contain two scripts: most of the document is written in the regional script, while some words, acronyms or numbers are written in Roman script. In this case, word-level or even character-level script identification is required to locate the second-script characters in the document. The major problem here is the small number of script descriptors available for script estimation, which leads to high error rates. The literature tries to address this problem by looking for more efficient descriptors. In this paper we propose a Multiple Training - One Test technique to alleviate this problem. Several classifiers are trained, each with words carrying a similar amount of information. A scale-invariant word information index is defined for this purpose. To identify the script of a query word, its word information index is computed, and its script is identified with the most appropriate classifier. Accuracy improvements have been obtained with this promising technique, especially for shorter words.
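The routing idea in the abstract (one classifier per group of words with a similar amount of information, and a query word sent to the classifier matching its own index) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not define the word information index, so it is passed in here as a precomputed number; the bin edges, the class names `MultipleTrainingOneTest` and `NearestCentroid`, and the toy data are all hypothetical; and a nearest-centroid rule stands in for the support vector machines listed in the keywords.

```python
from bisect import bisect_right
from collections import defaultdict


class NearestCentroid:
    """Tiny stand-in classifier (assumption: the paper's keywords mention SVMs)."""

    def fit(self, features, labels):
        sums, counts = {}, defaultdict(int)
        for vec, label in zip(features, labels):
            acc = sums.setdefault(label, [0.0] * len(vec))
            for i, value in enumerate(vec):
                acc[i] += value
            counts[label] += 1
        # Mean feature vector per script label.
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, vec):
        def squared_distance(centroid):
            return sum((a - b) ** 2 for a, b in zip(vec, centroid))

        return min(self.centroids, key=lambda lab: squared_distance(self.centroids[lab]))


class MultipleTrainingOneTest:
    """Train one classifier per information-index bin; test with the matching one."""

    def __init__(self, bin_edges):
        # Hypothetical bin edges over the (precomputed) word information index.
        self.bin_edges = sorted(bin_edges)
        self.models = {}

    def _bin(self, info_index):
        return bisect_right(self.bin_edges, info_index)

    def fit(self, samples):
        """samples: iterable of (info_index, feature_vector, script_label)."""
        groups = defaultdict(lambda: ([], []))
        for info_index, vec, label in samples:
            feats, labels = groups[self._bin(info_index)]
            feats.append(vec)
            labels.append(label)
        self.models = {b: NearestCentroid().fit(f, l) for b, (f, l) in groups.items()}
        return self

    def predict(self, info_index, vec):
        b = self._bin(info_index)
        if b not in self.models:
            # Fall back to the nearest bin that received training words.
            b = min(self.models, key=lambda k: abs(k - b))
        return self.models[b].predict(vec)


# Toy data (made up): the same feature vector is classified differently
# depending on which per-bin classifier the index selects.
samples = [
    (0.2, [0.0, 1.0], "regional"),
    (0.3, [1.0, 0.0], "roman"),
    (0.8, [0.0, 5.0], "roman"),
    (0.9, [5.0, 0.0], "regional"),
]
model = MultipleTrainingOneTest(bin_edges=[0.5]).fit(samples)
short_word_script = model.predict(0.25, [0.1, 0.9])
long_word_script = model.predict(0.85, [0.1, 0.9])
```

In this sketch the index only selects which trained model answers the query; how the index is computed and how many bins are used are left open, since the abstract does not specify them.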
Keywords :
document image processing; handwritten character recognition; support vector machines; vocabulary; Roman script; character level script identification; handwriting document image analysis field; handwritten word-script identification; multiple official languages; multipletraining; regional script estimation; scientific community; texture descriptors; Accuracy; Feature extraction; Histograms; Indexes; Testing; Training; Document Analysis; Handwritten Script Identification; Multiple training; Texture descriptors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location :
Heraklion
ISSN :
2167-6445
Print_ISBN :
978-1-4799-4335-7
Type :
conf
DOI :
10.1109/ICFHR.2014.132
Filename :
6981111