• DocumentCode
    3143176
  • Title

    An algorithm for extracting cursive text lines

  • Author

    Bruzzone, Elisabetta ; Coffetti, Meri Cristina

  • Author_Institution
    R&D Dept., Elsag SpA, Genova, Italy
  • fYear
    1999
  • fDate
    20-22 Sep 1999
  • Firstpage
    749
  • Lastpage
    752
  • Abstract
    This paper describes a new algorithm for extracting text lines from a cursive image field. The proposed algorithm is a fast and satisfactorily accurate procedure for isolating text lines without loss of information. It partitions the input image into vertical strips and, on that partition, analyzes horizontal run projections and groups and splits connected components, so that undulating or skewed text can be handled. The goal of the algorithm is to prevent ascending and descending characters from being corrupted by arbitrary cuts. The algorithm has been designed for cursive text and can also be applied to handwritten text. It preserves punctuation so that word extraction performs better in a subsequent phase of handwritten line processing.
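    The strip-wise projection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binary image given as rows of 0/1 pixels (1 = ink), it interprets a "horizontal run projection" as the number of ink runs per row within a strip, and it omits the connected component grouping and splitting step entirely. All function names and the `threshold` parameter are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixels, 1 = ink, 0 = background (assumed format)


def split_into_strips(width: int, n_strips: int) -> List[Tuple[int, int]]:
    """Partition the column range [0, width) into up to n_strips (start, end) ranges."""
    bounds = [round(i * width / n_strips) for i in range(n_strips + 1)]
    return [(bounds[i], bounds[i + 1])
            for i in range(n_strips) if bounds[i] < bounds[i + 1]]


def horizontal_run_projection(image: Image, col_start: int, col_end: int) -> List[int]:
    """For each row, count the horizontal runs of ink pixels inside the strip."""
    profile = []
    for row in image:
        runs, prev = 0, 0
        for px in row[col_start:col_end]:
            if px and not prev:  # a background-to-ink transition starts a new run
                runs += 1
            prev = px
        profile.append(runs)
    return profile


def text_bands(profile: List[int], threshold: int = 0) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, where profile > threshold."""
    bands, start = [], None
    for y, value in enumerate(profile):
        if value > threshold and start is None:
            start = y
        elif value <= threshold and start is not None:
            bands.append((start, y))
            start = None
    if start is not None:
        bands.append((start, len(profile)))
    return bands


def strip_line_bands(image: Image, n_strips: int):
    """Compute candidate text-line bands independently in each vertical strip."""
    width = len(image[0]) if image else 0
    return [(c0, c1, text_bands(horizontal_run_projection(image, c0, c1)))
            for c0, c1 in split_into_strips(width, n_strips)]
```

    Because each strip gets its own bands, a skewed line shows up as bands at slightly different heights in neighboring strips; linking those bands across strips, and handling ascenders and descenders that cross band boundaries, is the part this sketch leaves out.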
  • Keywords
    document image processing; edge detection; handwritten character recognition; image segmentation; ascending characters; connected component grouping; connected component splitting; cursive image field; cursive text line extraction algorithm; descending characters; handwritten line processing; handwritten text; horizontal run projections; input image partitioning; isolating text lines; skewed text; undulating text; vertical strips; word extraction; Algorithm design and analysis; Character recognition; Data mining; Image recognition; Image segmentation; Law; Legal factors; Pixel; Read only memory; Research and development;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
  • Conference_Location
    Bangalore
  • Print_ISBN
    0-7695-0318-7
  • Type

    conf

  • DOI
    10.1109/ICDAR.1999.791896
  • Filename
    791896