• DocumentCode
    3290237
  • Title
    A methodology to spot words in historical Arabic documents
  • Author
    Zirari, F.; Ennaji, Abdellatif; Nicolas, S.; Mammass, D.
  • Author_Institution
    LITIS Lab., Univ. of Rouen, Rouen, France
  • fYear
    2013
  • fDate
    27-30 May 2013
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Libraries hold large collections of printed historical Arabic documents that cannot be made available online because they lack a searchable index. Word spotting has previously been proposed as a way to build indexes for such collections by matching word images. In this paper we present a word spotting method for printed historical Arabic documents. We first segment words using the run-length smoothing algorithm, then describe the features selected to represent the word images. Elastic Dynamic Time Warping is used to match the features of two word images. The method was tested on the printed historical Arabic document database of the Moroccan National Library (MNL).
  • Keywords
    digital libraries; document image processing; feature extraction; history; image matching; image representation; image segmentation; indexing; optical character recognition; smoothing methods; word processing; Arabic printed historical document; MNL; Moroccan National Library; document collection; elastic dynamic time warping; feature selection; historical library collection; indexing; run length smoothing algorithm; word feature matching; word image matching; word image representation; word segmentation; word spotting method; Feature extraction; Image segmentation; Indexing; Libraries; Smoothing methods; Vectors; Arabic historical printed document; DTW; feature extraction; word segmentation; word spotting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Systems and Applications (AICCSA), 2013 ACS International Conference on
  • Conference_Location
    Ifrane
  • ISSN
    2161-5322
  • Type
    conf
  • DOI
    10.1109/AICCSA.2013.6616492
  • Filename
    6616492
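
The abstract names two building blocks: run-length smoothing for word segmentation and Dynamic Time Warping for matching word features. The following is a minimal sketch of both in their textbook forms, not the authors' implementation. The function names `rlsa_row` and `dtw_distance`, the `threshold` parameter, the Euclidean local cost, and the idea of representing a word as a sequence of per-column feature vectors are all assumptions made for illustration. The record does not say which features the paper uses or what its "elastic" DTW variant changes, so plain DTW is used here.

```python
from math import inf


def rlsa_row(row, threshold):
    """Horizontal run-length smoothing on one binary row (1 = ink, 0 = background).

    Background runs of length <= threshold that lie between two ink pixels
    are filled with ink, so nearby characters merge into one blob.
    `threshold` is an illustrative parameter, not a value from the paper.
    """
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, px in enumerate(row):
        if px == 1:
            gap = i - last_ink - 1 if last_ink is not None else 0
            if 0 < gap <= threshold:
                for j in range(last_ink + 1, i):
                    out[j] = 1
            last_ink = i
    return out


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-length feature vectors.

    Uses Euclidean distance as the local cost (an assumption) and the
    standard three-way recurrence; no band constraint or normalisation.
    """
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

In a word spotting pipeline of this kind, smoothing would typically be applied to every row of a binarised page so that the merged blobs can be cropped as candidate word images, and a query word would then be ranked against the candidates by ascending DTW cost. How the paper sets the smoothing threshold and ranks matches is not stated in this record.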