• DocumentCode
    2149266
  • Title
    Script-Free Text Line Segmentation Using Interline Space Model for Printed Document Images
  • Author
    Kim, Minwoo; Oh, Il-Seok
  • Author_Institution
    Div. of Comput. Sci. & Eng., Chonbuk Nat. Univ., Jeonju, South Korea
  • fYear
    2011
  • fDate
    18-21 Sept. 2011
  • Firstpage
    1354
  • Lastpage
    1358
  • Abstract
    This paper proposes a model-based text line segmentation algorithm for machine-printed document images. The model is based on a geometric configuration that uses the interline spaces rather than the text lines. The paper defines an objective function whose maximization leads to the optimal solution. The primary advantage of the proposed interline space model is that it is script-free. Additionally, the model is versatile: it can process both horizontally and vertically written documents and infer the reading order. Experiments on various document images in Latin, Korean, Chinese, and Japanese scripts support these advantages and show tolerance to noise.
  • Keywords
    document image processing; image segmentation; optimisation; text analysis; Chinese scripts; Japanese scripts; Korean scripts; Latin scripts; geometric configuration; interline space model; machine printed document image processing; maximization; model based text line segmentation algorithm; noise tolerance; objective function; optimal solution; script free text line segmentation; written document processing; Algorithm design and analysis; Analytical models; Floors; Image segmentation; Noise; Pattern analysis; Text analysis; geometric matching; interline space; model-based approach; reading order; text line segmentation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2011 International Conference on
  • Conference_Location
    Beijing
  • ISSN
    1520-5363
  • Print_ISBN
    978-1-4577-1350-7
  • Electronic_ISBN
    1520-5363
  • Type
    conf
  • DOI
    10.1109/ICDAR.2011.272
  • Filename
    6065531