• DocumentCode
    2199441
  • Title

    Impact of Character Models Choice on Arabic Text Recognition Performance

  • Author

    Slimane, Fouad ; Ingold, Rolf ; Kanoun, Slim ; Alimi, Adel M. ; Hennebert, Jean

  • Author_Institution
    Dept. of Inf., Univ. of Fribourg, Fribourg, Switzerland
  • fYear
    2010
  • fDate
    16-18 Nov. 2010
  • Firstpage
    670
  • Lastpage
    675
  • Abstract
    This paper analyzes the impact of sub-model choice on automatic recognition of printed Arabic text based on Hidden Markov Models (HMMs). In our approach, sub-models correspond to character shapes, which are assembled to compose word models. A peculiarity of Arabic writing is that characters take different shapes depending on their position in the word: with 28 basic characters, there are over 120 different shapes. Ideally, there should be one sub-model for each distinct shape. However, some shapes are less frequent than others and, because training databases are finite, the learning process yields less reliable models for the infrequent shapes. We show that an optimal set of models must therefore be found by trading off between having more models, which capture the intricacies of individual shapes, and grouping the models of similar shapes together. We propose several sets of sub-models and evaluate them on the Arabic Printed Text Image (APTI) database, which is freely available to the scientific community.
  • Keywords
    character recognition; hidden Markov models; image recognition; text analysis; Arabic printed text image database; Arabic printed text recognition; character models choice; hidden Markov model; learning process
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Frontiers in Handwriting Recognition (ICFHR), 2010 International Conference on
  • Conference_Location
    Kolkata
  • Print_ISBN
    978-1-4244-8353-2
  • Type

    conf

  • DOI
    10.1109/ICFHR.2010.110
  • Filename
    5693641