DocumentCode
183457
Title
Multiple Training - One Test Methodology for Handwritten Word-Script Identification
Author
Ferrer, Miguel A. ; Morales, Aythami ; Rodriguez, N. ; Pal, Umapada
Author_Institution
Innovacion en Comun., Univ. de Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain
fYear
2014
fDate
1-4 Sept. 2014
Firstpage
754
Lastpage
759
Abstract
Script identification is an important area in the field of handwritten document image analysis. Script identification at word level in documents written in multiple scripts is an open challenge for the scientific community and a real concern in countries with multiple official languages, e.g., India. Such documents usually contain two scripts: most of the document is written in the regional script, while some words, acronyms, or numbers are written in Roman script. In this case, word-level or even character-level script identification is required to locate the second-script characters in the document. The major problem here is that few script descriptors are available for script estimation, which leads to high error rates. The literature tries to address this problem by looking for more efficient descriptors. In this paper we propose a Multiple Training - One Test technique to alleviate this problem. Several classifiers are trained, each one with words containing a similar amount of information. A scale-invariant word information index is defined for this purpose. To identify the script of a query word, its word information index is computed, and its script is identified with the most appropriate classifier. Accuracy improvements have been obtained with this promising technique, especially for shorter words.
Keywords
document image processing; handwritten character recognition; support vector machines; vocabulary; Roman script; character level script identification; handwriting document image analysis field; handwritten word-script identification; multiple official languages; multiple training; regional script estimation; scientific community; texture descriptors; Accuracy; Feature extraction; Histograms; Indexes; Testing; Training; Document Analysis; Handwritten Script Identification; Multiple training; Texture descriptors
fLanguage
English
Publisher
IEEE
Conference_Titel
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location
Heraklion
ISSN
2167-6445
Print_ISBN
978-1-4799-4335-7
Type
conf
DOI
10.1109/ICFHR.2014.132
Filename
6981111