• DocumentCode
    2540362
  • Title
    A handwriting textline extraction approach based on connected domain
  • Author
    Gao, Wei ; Sun, Fuchun ; Yin, Zhonghang
  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
  • fYear
    2010
  • fDate
    7-9 July 2010
  • Firstpage
    217
  • Lastpage
    222
  • Abstract
    This paper describes an approach for extracting words, textlines, and text blocks by analyzing the spatial configuration of connected domains and word contour rectangles in a given document image. The basic idea is that connected components of black pixels and their contours can be used as computational units in document image analysis. For every contour rectangle, we find a spatial feature and its overlap relationships, and we call the resulting feature rectangle the “Standard Rectangle” (SR). We then calculate the split line of every textline through a series of operations on the SRs and assign the word contour rectangles to different lines. In the next step, we determine whether adjacent textlines overlap. If they do, we calculate the overlap distance and move the word contour rectangles accordingly. Our experiments show that the approach works well on both overlapped and detached textlines.
  • Keywords
    document image processing; edge detection; handwriting recognition; text analysis; document image analysis; handwriting textline extraction approach; word contour rectangles; Artificial neural networks; Layout; Noise; Pixel; Signal processing algorithms; Strontium; Surface morphology; connected domain; handwriting; textline extraction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cognitive Informatics (ICCI), 2010 9th IEEE International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-8041-8
  • Type
    conf
  • DOI
    10.1109/COGINF.2010.5599738
  • Filename
    5599738
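
The abstract's last steps (separating word contour rectangles into lines, then measuring how far adjacent textlines overlap) can be illustrated with a rough sketch. This is not the paper's Standard Rectangle method, whose details the abstract does not give. It is a simplified stand-in that groups rectangles greedily by vertical center. The rectangle layout `(left, top, right, bottom)` and the function names `group_into_lines` and `overlap_distance` are assumptions made for illustration only.

```python
from typing import List, Tuple

# Assumed layout: (left, top, right, bottom) in pixels, y growing downward.
Rect = Tuple[int, int, int, int]


def group_into_lines(rects: List[Rect]) -> List[List[Rect]]:
    """Greedily group word rectangles into textlines by vertical position.

    Simplified stand-in for the paper's split-line computation.
    """
    lines: List[List[Rect]] = []
    # Visit rectangles from top to bottom, ordered by vertical center.
    for rect in sorted(rects, key=lambda r: (r[1] + r[3]) / 2):
        center_y = (rect[1] + rect[3]) / 2
        if lines:
            # Vertical band currently covered by the most recent line.
            band_top = min(r[1] for r in lines[-1])
            band_bottom = max(r[3] for r in lines[-1])
            if band_top <= center_y < band_bottom:
                lines[-1].append(rect)
                continue
        # The center lies outside the current band: start a new line.
        lines.append([rect])
    # Order the words of each line from left to right.
    return [sorted(line, key=lambda r: r[0]) for line in lines]


def overlap_distance(upper: List[Rect], lower: List[Rect]) -> int:
    """Rows by which the upper line's band reaches into the lower line's band."""
    upper_bottom = max(r[3] for r in upper)
    lower_top = min(r[1] for r in lower)
    return max(0, upper_bottom - lower_top)


if __name__ == "__main__":
    # Hypothetical word rectangles forming two slightly overlapping lines.
    words = [(0, 0, 10, 10), (12, 2, 20, 11), (0, 9, 10, 20), (12, 10, 22, 19)]
    lines = group_into_lines(words)
    for upper, lower in zip(lines, lines[1:]):
        print(overlap_distance(upper, lower))
```

A nonzero distance marks a pair of adjacent lines that overlap. The abstract states that the word contour rectangles are then moved according to this distance, but it does not say how, so that step is left out of the sketch.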