Title :
Restoring Chinese documents images based on text boundary lines
Author :
Liu, Hong ; Ding, Runwei
Author_Institution :
Key Lab. of Machine Perception & Intell., Peking Univ., Beijing, China
Abstract :
Distortion commonly appears in document images produced by scanning thick bound volumes. Scanned grayscale images exhibit two kinds of distortion: a shadow appears at the volume's spine area, and the words inside the shadow are warped. This paper presents a novel method, based on text boundary lines, for the efficient restoration of warped scanned Chinese document images. We first detect on which side of the image the shadow lies using a row grayscale analysis method. The shadow is then removed by a modified Niblack's algorithm. To detect the warping, a text boundary line detection method is proposed. Finally, an adjustment method based on the text boundary lines is carried out to restore the warped words. Experiments are conducted on 400 varied scanned Chinese document images. Average character recall improves by 11.92% to 14.89%. The experiments show that the proposed restoration method is efficient for Chinese documents containing both text and non-text regions.
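The first two steps of the abstract (locating the shadow side, then binarizing with a Niblack-style local threshold) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the row grayscale analysis or how Niblack's algorithm was modified, so the code below assumes a simple left-half versus right-half mean comparison and uses the standard Niblack threshold T = m + k*s. The function names `shadow_side` and `niblack_binarize`, the window size, and k = -0.2 are all illustrative assumptions.

```python
from statistics import mean, pstdev


def shadow_side(image):
    """Guess which side holds the spine shadow.

    Assumption (not from the paper): the darker half of the page,
    measured by mean gray level, is the shadowed side.
    `image` is a list of rows of gray values in 0..255.
    """
    half = len(image[0]) // 2
    left = mean(v for row in image for v in row[:half])
    right = mean(v for row in image for v in row[half:])
    return "left" if left < right else "right"


def niblack_binarize(image, window=15, k=-0.2):
    """Standard (unmodified) Niblack local thresholding.

    For each pixel, the threshold is T = m + k * s, where m and s are
    the mean and standard deviation of the surrounding window (clipped
    at the image border). Pixels above T become 255, others become 0.
    """
    height, width = len(image), len(image[0])
    r = window // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            patch = [
                image[j][i]
                for j in range(max(0, y - r), min(height, y + r + 1))
                for i in range(max(0, x - r), min(width, x + r + 1))
            ]
            threshold = mean(patch) + k * pstdev(patch)
            out[y][x] = 255 if image[y][x] > threshold else 0
    return out
```

Because the threshold follows the local mean, a dark stroke inside a shadowed region is still compared against its own darkened neighborhood, which is why local methods of this family suit uneven illumination better than a single global threshold. A production version would likely use a vectorized or integral-image implementation rather than this per-pixel loop.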
Keywords :
distortion; document image processing; image denoising; image restoration; object detection; text analysis; Chinese documents image restoration; adjustment method; document image distortion; image detection; modified Niblack algorithm; row grayscale analysis method; scanned grayscale images; text boundary lines based method; warped scanning Chinese document images; Books; Computer vision; Gray-scale; Image processing; Image reconstruction; Image restoration; Laboratories; Machine intelligence; Shape; Surface reconstruction; distortion; restoration; text boundary lines; warped document images;
Conference_Title :
Systems, Man and Cybernetics, 2009. SMC 2009. IEEE International Conference on
Conference_Location :
San Antonio, TX
Print_ISBN :
978-1-4244-2793-2
Electronic_ISBN :
1062-922X
DOI :
10.1109/ICSMC.2009.5346660