DocumentCode :
3777153
Title :
Extracting text from degraded document image
Author :
Radhika Patel;Suman K. Mitra
Author_Institution :
Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar, India 382007
fYear :
2015
Firstpage :
1
Lastpage :
4
Abstract :
The current era of digitization is expected to digitize many important old documents that have degraded for various reasons. Degraded document image binarization poses many challenges, such as intensity variation, background contrast variation, bleed-through, and text size variation. Many approaches are available for document image binarization, but none can handle all types of degradation at once. We propose an approach consisting of three stages: preprocessing, text-area detection, and postprocessing. Preprocessing enhances the contrast of the image. The next stage identifies the text area. Postprocessing handles false positives and false negatives based on the intensity values of the preprocessed and gray images. Performance is evaluated using various quantitative measures and compared with the method regarded as the best so far. The algorithm is also expected to be independent of script, and hence is tested on degraded Gujarati document images.
Keywords :
"Image edge detection","Frequency modulation","Histograms","Image segmentation","Distortion measurement","Data mining","Degradation"
Publisher :
IEEE
Conference_Title :
Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2015 Fifth National Conference on
Type :
conf
DOI :
10.1109/NCVPRIPG.2015.7490017
Filename :
7490017