DocumentCode :
1992776
Title :
A model-based line detection algorithm in documents
Author :
Zheng, Yefeng ; Li, Huiping ; Doermann, David
Author_Institution :
Lab. for Language & Media Process., Maryland Univ., College Park, MD, USA
fYear :
2003
fDate :
3-6 Aug. 2003
Firstpage :
44
Abstract :
In this paper we present a novel model-based approach to detecting severely broken parallel lines in noisy textual documents. Detecting and removing these lines is important so that the text can be segmented and recognized. We use the directional single-connected chain, a vectorization-based algorithm, to extract the line segments. We then instantiate a parallel line model with three parameters: the skew angle, the vertical line gap, and the vertical translation. A coarse-to-fine approach is used to improve the estimation accuracy. The model lets us incorporate high-level contextual information to enhance detection results even when lines are severely broken. Our experimental results show that our method detects 94% of the lines in our database of 168 noisy Arabic document images.
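The three-parameter parallel line model named in the abstract can be illustrated with a minimal sketch. The abstract gives no equations, so the parameterization below is an assumption: line k is taken to satisfy y = x * tan(skew) + offset + k * gap. The function and parameter names (`model_line_y`, `nearest_line_index`, `skew_deg`, `gap`, `offset`) are hypothetical and are not taken from the paper.

```python
import math


def model_line_y(x, k, skew_deg, gap, offset):
    """y-coordinate of the k-th model line at horizontal position x.

    Assumed form (not from the paper): y = x * tan(skew) + offset + k * gap,
    where skew is the skew angle, gap the vertical line gap, and offset the
    vertical translation.
    """
    return x * math.tan(math.radians(skew_deg)) + offset + k * gap


def nearest_line_index(x, y, skew_deg, gap, offset):
    """Index of the model line vertically closest to the point (x, y)."""
    # Remove the skew contribution, then measure in units of the line gap.
    deskewed_y = y - x * math.tan(math.radians(skew_deg))
    return round((deskewed_y - offset) / gap)
```

Under this assumed form, a fragment of a broken line can be assigned to a model line from the position of any one of its points, which is one way a fitted model could supply context for segments that are too short to identify on their own.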
Keywords :
image enhancement; image processing; image recognition; image segmentation; optical character recognition; text analysis; coarse-to-fine approach; directional single-connected chain; estimation accuracy; line segment extraction; model-based line detection algorithm; noisy Arabic document images; noisy textual documents; parallel line model; severely broken parallel lines; skew angle; text recognition; text segmentation; vectorization based algorithm; vertical line gap; vertical translation; Concurrent computing; Context modeling; Data mining; Detection algorithms; Educational institutions; Electronic mail; Image segmentation; Laboratories; Optical character recognition software; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on
Print_ISBN :
0-7695-1960-1
Type :
conf
DOI :
10.1109/ICDAR.2003.1227625
Filename :
1227625