Title :
Skew estimation for scanned documents from "noises"
Author :
Yuan, Bo ; Tan, Chew Lim
Author_Institution :
Centre for Remote Imaging, Sensing and Processing, National University of Singapore, Singapore
Date :
29 Aug.-1 Sept. 2005
Abstract :
The vast majority of published skew estimation methods for scanned document images target textual documents. These methods rest on the principle that the skew angle can be derived from the presence of obvious text lines. Non-textual objects, such as line drawings, photographic inserts, scan artifacts (including the dark bars around the borders and along the center spine of bound materials), and media contamination, are treated as "noises" and are therefore subject to elimination. Skew estimators that still work in the presence of excessive noise are considered robust. This paper presents a skew estimation method based on straight lines or edges. It uses the Hough transform with a probe-line mapping scheme for feature identification, and various strategies for optimized line probing are devised. The method is applicable to both textual and graphical documents scanned with ordinary scanners or copiers under normal conditions. Selected images from the University of Washington English Document Image Database I (UWDB-I) are used to evaluate its usability.
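The abstract does not spell out the probe-line mapping scheme, so the sketch below is not the authors' method. It only illustrates the general idea of Hough-style voting for a skew angle from straight lines, under stated assumptions: the input is a binary array of foreground or edge pixels, the skew lies within a small search range, and the function name `estimate_skew_hough` and its parameters are hypothetical.

```python
import numpy as np

def estimate_skew_hough(binary, max_skew_deg=10.0, step_deg=0.5):
    """Return the candidate angle (degrees) that best aligns foreground pixels.

    Illustrative Hough-style voting only; not the paper's probe-line scheme.
    Angles are measured in image coordinates (y grows downward).
    """
    ys, xs = np.nonzero(binary)  # coordinates of foreground (edge) pixels
    if xs.size == 0:
        raise ValueError("no foreground pixels")

    n_steps = int(round(2 * max_skew_deg / step_deg)) + 1
    angles = np.linspace(-max_skew_deg, max_skew_deg, n_steps)

    best_angle, best_score = 0.0, -1.0
    for angle in angles:
        t = np.deg2rad(angle)
        # Points on a line y = x*tan(t) + c all share the same rho value.
        rho = ys * np.cos(t) - xs * np.sin(t)
        bins = np.floor(rho).astype(np.int64)
        counts = np.bincount(bins - bins.min())
        # Sum of squared votes rewards angles that concentrate the points
        # into few accumulator bins.
        score = float(np.sum(counts.astype(np.float64) ** 2))
        if score > best_score:
            best_angle, best_score = float(angle), score
    return best_angle

# Synthetic page: three parallel straight lines skewed by 3 degrees.
img = np.zeros((200, 400), dtype=bool)
cols = np.arange(400)
for offset in (40, 80, 120):
    rows = np.round(offset + cols * np.tan(np.deg2rad(3.0))).astype(int)
    img[rows, cols] = True

print(estimate_skew_hough(img))
```

The sum-of-squares score is one common design choice for picking the dominant angle; a full accumulator with peak detection, or the probing strategies the paper mentions, would replace it in a real implementation.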
Keywords :
document image processing; edge detection; feature extraction; image scanners; text analysis; visual databases; Hough transform; copier; edges; feature identification; graphical document; line drawings; photographic inserts; probe-line mapping scheme; scan artifacts; scanned document image; scanners; skew estimation method; straight lines; textual document; Bars; Computer science; Computer vision; Contamination; Filters; Image databases; Image edge detection; Noise robustness; Shape; Usability
Conference_Title :
Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on
Print_ISBN :
0-7695-2420-6
DOI :
10.1109/ICDAR.2005.219