Title :
Robust skew detection in mixed text/graphics documents
Author :
Amin, Adnan ; Wu, Sue
Author_Institution :
Sch. of Comput. Sci. & Eng., New South Wales Univ., Sydney, NSW, Australia
Date :
29 Aug.-1 Sept. 2005
Abstract :
Document image processing has become an increasingly important technology in the automation of office documentation tasks. Automatic document scanners such as text readers and OCR (optical character recognition) systems are essential components of systems capable of those tasks. One problem in this field is that the document to be read is not always placed correctly on a flat-bed scanner: it may be skewed on the scanner bed, resulting in a skewed image. This skew has a detrimental effect on document analysis, document understanding, and character segmentation and recognition. Consequently, detecting the skew of a document image and correcting it are important issues in realizing a practical document reader. The proposed skew detection algorithm has no restriction on the detectable angle range and does not rely on large blocks of text. It works well on textual document images, graphical images, and mixed text/graphics images. The performance of the system was evaluated on over 60 images, consisting of real-life documents such as envelopes and artificial mixed text/graphics icons. Compared with other methods, the skew detection algorithm remains robust when very few text lines are present in the document image.
Keywords :
document image processing; image segmentation; optical character recognition; text analysis; OCR systems; automatic document scanners; character segmentation; document understanding; flat-bed scanner; graphics documents; office documentation tasks; skew detection algorithm; text readers; Automation; Character recognition; Detection algorithms; Documentation; Graphics; Optical character recognition software; Optical devices; Robustness
Conference_Title :
Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on
Print_ISBN :
0-7695-2420-6
DOI :
10.1109/ICDAR.2005.203