• DocumentCode
    2010944
  • Title

    Document Preprocessing System - Automatic Selection of Binarization

  • Author

    Messaoud, I.B. ; Amiri, Hamid ; Abed, H.E. ; Margner, Volker

  • Author_Institution
    Lab. des Syst. et Traitement de Signal (LSTS), Ecole Nat. d'Ing. de Tunis (ENIT), Tunis, Tunisia
  • fYear
    2012
  • fDate
    27-29 March 2012
  • Firstpage
    85
  • Lastpage
    89
  • Abstract
    Because historical documents exhibit many degradations, their analysis is considered a major challenge. In this paper we present a system for the automatic preprocessing of historical documents. For each book in the database, one or more preprocessing methods, together with their sets of input parameters, are selected according to the features of the input images. The selection is tested on a subset of each book's images during the training step, and the results are validated on a different subset of images. If validation fails, the training is repeated. The proposed system is applied to a set of books from the Google Books (23 books, 1000 images) and Bayerische Staatsbibliothek (10 books, 750 images) collections. The results obtained are very promising.
  • Keywords
    document image processing; feature extraction; history; Bayerische Staatsbibliothek collections; Google Books; automatic binarization selection; document analysis; historical document preprocessing system; input image features; training step; Databases; Frequency modulation; Measurement; PSNR; Text analysis; Training; Historical document analysis; automatic parameters selection; binarization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on
  • Conference_Location
    Gold Coast, QLD
  • Print_ISBN
    978-1-4673-0868-7
  • Type

    conf

  • DOI
    10.1109/DAS.2012.31
  • Filename
    6195340