• DocumentCode
    3403805
  • Title
    Fast text line extraction in document images
  • Author
    Seong Jong Ha ; Bora Jin ; Nam Ik Cho
  • Author_Institution
    Dept. of EECS, Seoul Nat. Univ., Seoul, South Korea
  • fYear
    2012
  • fDate
    Sept. 30 2012-Oct. 3 2012
  • Firstpage
    797
  • Lastpage
    800
  • Abstract
    This paper proposes an algorithm for fast text line extraction in document images. Instead of binarizing the image or applying multi-oriented Gaussian blurring, as conventional methods do, we use an integral image and design filters suited to detecting text regions on it. After filtering, the center points of the regions are found by cascaded text region verification followed by non-maximum suppression. Finally, text lines are extracted by grouping the points that lie on the same line. The proposed method is tested on document images taken in various environments and is shown to be faster than the conventional methods while achieving comparable performance.
  • Keywords
    Gaussian processes; document image processing; feature extraction; filtering theory; image segmentation; text detection; center points; design filters; document images; fast text line extraction; image binarization; integral image; multioriented Gaussian blurring; nonmaximum suppression; points grouping; text region detection; text region verification; Algorithm design and analysis; Complexity theory; Computer vision; Feature extraction; Histograms; Joining processes; Optical character recognition software; text line extraction;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing (ICIP), 2012 19th IEEE International Conference on
  • Conference_Location
    Orlando, FL
  • ISSN
    1522-4880
  • Print_ISBN
    978-1-4673-2534-9
  • Electronic_ISBN
    1522-4880
  • Type
    conf
  • DOI
    10.1109/ICIP.2012.6466980
  • Filename
    6466980
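
The abstract's speed claim rests on filtering over an integral image, where the sum over any rectangle costs four lookups regardless of its size. The following is a minimal sketch of that general idea only, not the authors' filters or verification cascade, which this record does not describe. The function names `integral_image` and `box_sum` and the half-open rectangle convention are assumptions made for illustration.

```python
def integral_image(img):
    """Build a summed-area table for a 2D list of numbers.

    The table has one extra row and column of zeros, so that
    ii[y][x] holds the sum of img[0:y][0:x].
    """
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of the rectangle rows top..bottom-1, cols left..right-1.

    Uses four table lookups, independent of the rectangle size.
    """
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    table = integral_image(image)
    print(box_sum(table, 1, 1, 3, 3))  # sum of the lower-right 2x2 block
```

A box filter response at any position and scale can be assembled from a few such `box_sum` calls, which is why this representation avoids the per-pixel cost of repeated Gaussian blurring.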