Title :
Document Preprocessing System - Automatic Selection of Binarization
Author :
Messaoud, I.B. ; Amiri, Hamid ; Abed, H.E. ; Margner, Volker
Author_Institution :
Lab. des Syst. et Traitement de Signal (LSTS), Ecole Nat. d'Ing. de Tunis (ENIT), Tunis, Tunisia
Abstract :
Historical documents exhibit many kinds of degradation, which makes their analysis a major challenge. In this paper we present a system for the automatic preprocessing of historical documents. For each book in the database, one or more preprocessing methods, together with their sets of input parameters, are selected according to the features of the input images. The selection is tested on a subset of each book's images during the training step, and the results are validated on a separate subset of images. If the validation fails, the training is repeated. The proposed system is applied to a set of books from the Google Books (23 books, 1000 images) and Bayerische Staatsbibliothek (10 books, 750 images) collections. The results obtained are very promising.
Keywords :
document image processing; feature extraction; history; Bayerische Staatsbibliothek collections; Google Books; automatic binarization selection; document analysis; historical document preprocessing system; input image features; training step; Databases; Frequency modulation; Measurement; PSNR; Text analysis; Training; Historical document analysis; automatic parameters selection; binarization;
Conference_Title :
Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on
Conference_Location :
Gold Coast, QLD
Print_ISBN :
978-1-4673-0868-7
DOI :
10.1109/DAS.2012.31