  • DocumentCode
    183457
  • Title
    Multiple Training - One Test Methodology for Handwritten Word-Script Identification
  • Author
    Ferrer, Miguel A. ; Morales, Aythami ; Rodriguez, N. ; Pal, Umapada
  • Author_Institution
    Innovacion en Comun., Univ. de Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain
  • fYear
    2014
  • fDate
    1-4 Sept. 2014
  • Firstpage
    754
  • Lastpage
    759
  • Abstract
    Script identification is an important area in the field of handwritten document image analysis. Word-level script identification in documents written in multiple scripts is an open challenge for the scientific community and a real concern in countries with multiple official languages, e.g., India. Such documents usually contain two scripts: most of the document is written in the regional script, while some words, acronyms or numbers are written in Roman script. In this case, word-level or even character-level script identification is required to locate the second-script characters in the document. The major problem here is that few script descriptors are available for script estimation, which leads to high error rates. The literature tries to address this problem by looking for more efficient descriptors. In this paper we propose a Multiple Training - One Test technique to alleviate this problem. Several classifiers are trained, each one with words carrying a similar amount of information. A scale-invariant word information index is defined for this purpose. To identify the script of a query word, its word information index is computed, and its script is identified with the most appropriate classifier. Accuracy improvements have been obtained with this promising technique, especially for shorter words.
  • Keywords
    document image processing; handwritten character recognition; support vector machines; vocabulary; Roman script; character level script identification; handwriting document image analysis field; handwritten word-script identification; multiple official languages; multiple training; regional script estimation; scientific community; texture descriptors; Accuracy; Feature extraction; Histograms; Indexes; Testing; Training; Document Analysis; Handwritten Script Identification; Multiple training; Texture descriptors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
  • Conference_Location
    Heraklion
  • ISSN
    2167-6445
  • Print_ISBN
    978-1-4799-4335-7
  • Type
    conf
  • DOI
    10.1109/ICFHR.2014.132
  • Filename
    6981111
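
The routing idea summarized in the abstract (train several classifiers, each on words with a similar amount of information, then send a query word to the classifier that matches its word information index) might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not define the word information index, so it is treated here as a precomputed number; the bin edges, the function names (bin_of, train_multiple, identify_script) and the nearest-centroid classifier are all hypothetical stand-ins (the record's keywords mention support vector machines).

```python
from bisect import bisect_right
from collections import defaultdict


def bin_of(index, edges):
    """Map a word information index to a bin number, given sorted bin edges."""
    return bisect_right(edges, index)


class CentroidClassifier:
    """Hypothetical stand-in classifier: nearest class centroid."""

    def fit(self, features, labels):
        sums, counts = {}, defaultdict(int)
        for x, y in zip(features, labels):
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[y] += 1
        self.centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}
        return self

    def predict(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda y: sq_dist(self.centroids[y]))


def train_multiple(samples, edges):
    """Train one classifier per bin.

    samples: iterable of (word_information_index, feature_vector, script_label).
    The index is assumed to be computed elsewhere.
    """
    grouped = defaultdict(lambda: ([], []))
    for index, x, y in samples:
        feats, labels = grouped[bin_of(index, edges)]
        feats.append(x)
        labels.append(y)
    return {b: CentroidClassifier().fit(f, l) for b, (f, l) in grouped.items()}


def identify_script(classifiers, edges, index, x):
    """Route the query word to the classifier trained on words of similar index."""
    b = bin_of(index, edges)
    if b not in classifiers:
        # Assumption: fall back to the nearest bin that has a trained classifier.
        b = min(classifiers, key=lambda k: abs(k - b))
    return classifiers[b].predict(x)
```

With one bin edge at 0.5, for example, a query word with index 0.25 would be classified only by the model trained on words whose index is below 0.5, so short words are never compared against models fitted to long words.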