DocumentCode
2149266
Title
Script-Free Text Line Segmentation Using Interline Space Model for Printed Document Images
Author
Kim, Minwoo ; Oh, Il-Seok
Author_Institution
Div. of Comput. Sci. & Eng., Chonbuk Nat. Univ., Jeonju, South Korea
fYear
2011
fDate
18-21 Sept. 2011
Firstpage
1354
Lastpage
1358
Abstract
This paper proposes a model-based text line segmentation algorithm for machine-printed document images. The model is based on a geometric configuration of the interline spaces rather than of the text lines. The paper defines an objective function whose maximization yields the optimal solution. The primary advantage of the interline space model is that it is script-free. Additionally, the model is versatile: it processes both horizontally and vertically written documents and infers the reading order. Experiments on various document images in Latin, Korean, Chinese, and Japanese scripts support these advantages and show tolerance to noise.
Keywords
document image processing; image segmentation; optimisation; text analysis; Chinese scripts; Japanese scripts; Korean scripts; Latin scripts; geometric configuration; interline space model; machine printed document image processing; maximization; model based text line segmentation algorithm; noise tolerance; objective function; optimal solution; script free text line segmentation; written document processing; Algorithm design and analysis; Analytical models; Floors; Image segmentation; Noise; Pattern analysis; Text analysis; geometric matching; interline space; model-based approach; reading order; text line segmentation
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location
Beijing
ISSN
1520-5363
Print_ISBN
978-1-4577-1350-7
Electronic_ISBN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2011.272
Filename
6065531