  • DocumentCode
    3695209
  • Title
    ALTID: Arabic/Latin Text Images Database for recognition research
  • Author
    Imen Chtourou; Ahmed Cheikh Rouhou; Faten Kallel Jaiem; Slim Kanoun
  • Author_Institution
    MIRACL laboratory, ISIMS, University of Sfax, Tunisia
  • fYear
    2015
  • Firstpage
    836
  • Lastpage
    840
  • Abstract
    This paper presents a new public offline database of Arabic/Latin printed and handwritten text. The database was developed for performance evaluation, result comparison, and the development of new methods in document analysis and recognition. It may be used for script identification, font identification, writer identification, and word segmentation. The printed text was scanned from 731 pages of Latin and Arabic printed documents in grayscale at a resolution of 300 dpi. After performing manual segmentation, we obtained 1845 Arabic and 2328 Latin text images. The handwritten dataset includes 460 Arabic and 582 Latin text blocks written by 17 individuals of different ages and educational levels. Each text image in the database is provided with a ground truth file.
  • Keywords
    "Image resolution","Image segmentation","Manuals","Optical imaging","Indexes","Handwriting recognition"
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2015 13th International Conference on
  • Type
    conf
  • DOI
    10.1109/ICDAR.2015.7333879
  • Filename
    7333879