• DocumentCode
    3695132
  • Title

    Arabic handwritten document preprocessing and recognition

  • Author

    Edgard Chammas;Chafic Mokbel;Laurence Likforman-Sulem

  • Author_Institution
    University of Balamand, El-Koura, Lebanon
  • fYear
    2015
  • Firstpage
    451
  • Lastpage
    455
  • Abstract
    Arabic handwritten documents present specific challenges due to the cursive nature of the writing and the presence of diacritical marks. Moreover, one of the largest labeled databases of Arabic handwritten documents, the OpenHart-NIST database, includes a specific kind of noise, namely guidelines, that has to be addressed. We propose several approaches to process these documents. First, we develop a K-means-based guideline detection approach that identifies the documents that include guidelines. We then propose a series of preprocessing steps at the text-line level to reduce the effects of noise. For text lines that include guidelines, we describe a guideline removal step and assess existing keystroke restoration approaches. In addition, we propose a preprocessing step that combines noise removal and deskewing by removing line fragments from neighboring text lines while searching for the principal orientation of the text line. We provide recognition results showing the significant improvement brought by the proposed processing steps.
  • Keywords
    "Hidden Markov models","Image recognition","Optical imaging","Optical reflection","Text recognition","Image segmentation","Writing"
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2015 13th International Conference on
  • Type

    conf

  • DOI
    10.1109/ICDAR.2015.7333802
  • Filename
    7333802