Title :
A Rule Based Approach for Skew Correction and Removal of Insignificant Data from Scanned Text Documents of Devanagari Script
Author :
Sharma, Pramod Kumar ; Dhingra, Kapil Dev ; Sanyal, Sudip
Author_Institution :
IIT Allahabad India, Allahabad
Abstract :
This paper presents a rule-based approach for removing insignificant data and correcting skew in scanned documents of Devanagari script. Developing an OCR system for Devanagari script is not easy, and proper preprocessing of the scanned documents requires removing noise and correcting skew in the image. The proposed system combines rule-based methods, morphological operations, and connected component labeling. The images used in the experiments are binarised grayscale images. The experimental results show that the presented method is robust for preprocessing scanned images of Devanagari text documents.
Keywords :
document image processing; mathematical morphology; text analysis; Devanagari script; binarised grayscale images; connected component labeling; morphological operations; insignificant data removal; rule based approach; scanned text documents; skew correction; Books; Focusing; Gray-scale; Internet; Labeling; Morphological operations; Noise robustness; Optical character recognition software; Printing; Thumb; Devanagari Script; Insignificant Data; OCR; Preprocessing; Skew Correction;
Conference_Title :
Signal-Image Technologies and Internet-Based System, 2007. SITIS '07. Third International IEEE Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3122-9
DOI :
10.1109/SITIS.2007.93