• DocumentCode
    1815815
  • Title

    Investigation of binarization techniques for unevenly illuminated document images acquired via handheld cameras

  • Author

    Alqudah, Musab Kasim ; Bin Nasrudin, Mohammad F. ; Bataineh, Bilal ; Alqudah, Mashal ; Alkhatatneh, Arwa

  • Author_Institution
    Center for Artificial Intell. Technol., Univ. Kebangsaan Malaysia, Bangi, Malaysia
  • fYear
    2015
  • fDate
    21-23 April 2015
  • Firstpage
    524
  • Lastpage
    529
  • Abstract
    Cameras in handheld devices, e.g., mobile phones, have become the fastest and easiest method for capturing document images. However, document images captured with handheld cameras have rarely been collected and investigated. Digitization of text from the captured images presents a challenge because these images are prone to non-uniform lighting, uneven illumination, skew and shadow. The objectives of this paper are, first, to provide a benchmark dataset of document images captured via modern handheld devices and, second, to evaluate several binarization methods (i.e., Niblack, Sauvola, Wolf, Nick and Bataineh) using this dataset and certain meaningful measurements. The results show that the Nick and Bataineh methods achieved the best results in the English Printed Document Images (EPDI) test, whereas the Nick and Sauvola methods surpassed the other methods in the Arabic Printed Document Images (APDI) test, which consists of two decoration formats. The Nick method surpassed the other methods on documents that did not contain Harakat, and the Sauvola method surpassed the other methods on documents that did contain Harakat.
  • Keywords
    cameras; document image processing; APDI test; Arabic printed document image test; EPDI test; English printed document images test; Nick and Bataineh methods; Nick and Sauvola methods; benchmark dataset; binarization techniques; decoration formats; handheld cameras; handheld devices; illuminated document images; mobile phones; nonuniform lighting; text digitization; Benchmark testing; Cameras; Lighting; PSNR; Standards; Visualization; Document Image Benchmark; Global Binarization; Hand-held Camera; Local Binarization; Thresholding; Uneven Illumination;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer, Communications, and Control Technology (I4CT), 2015 International Conference on
  • Conference_Location
    Kuching
  • Type

    conf

  • DOI
    10.1109/I4CT.2015.7219634
  • Filename
    7219634