• DocumentCode
    3340612
  • Title

    A study for high performance character extraction from color scene images

  • Author

    Shirai, Keiichiro ; Wakabayashi, Masanori ; Okamoto, Masayuki ; Yamamoto, Hiroaki

  • fYear
    2008
  • fDate
    16-19 Sept. 2008
  • Firstpage
    293
  • Lastpage
    298
  • Abstract
    This paper describes a method for extracting character strings from scene images. Most characters in scene images appear with the same color and font size within each word or text line. In our algorithm, a scene image is first divided into several blocks based on edges in the color space. Then blobs, which consist of pixels of similar color, are extracted for each block by clustering in a color space. These blobs correspond to either characters or background patterns; after they are connected using their aspect ratios and pitches, an SVM (Support Vector Machine) applied to several textural features of these blobs classifies each connected blob as a character or a background pattern. Tests with 251 images from the ICDAR 2003 Text Locating Competition show the effectiveness of our algorithm.
  • Keywords
    Computational intelligence; Data mining; Discrete cosine transforms; Filters; Image edge detection; Image segmentation; Layout; Machine learning; Pulse modulation; Text analysis; Character extraction; Clustering; SVM;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis Systems, 2008. DAS '08. The Eighth IAPR International Workshop on
  • Conference_Location
    Nara
  • Print_ISBN
    978-0-7695-3337-7
  • Type

    conf

  • DOI
    10.1109/DAS.2008.57
  • Filename
    4669973
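
The abstract's blob-extraction step (grouping pixels of similar color within one block by clustering in a color space) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so plain k-means over RGB triples is an assumption here, and the function names `sq_dist` and `kmeans_colors` are hypothetical, not taken from the paper.

```python
from typing import List, Sequence, Tuple

Color = Tuple[int, int, int]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_colors(pixels: List[Color], k: int, iters: int = 10) -> List[int]:
    """Assign each pixel of one block to a color cluster.

    Assumption: plain k-means in RGB stands in for the paper's
    unspecified color-space clustering.
    """
    # Deterministic initialization: the first k distinct colors seen.
    seeds: List[Color] = []
    for p in pixels:
        if p not in seeds:
            seeds.append(p)
        if len(seeds) == k:
            break
    centers = [tuple(float(c) for c in s) for s in seeds]

    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: nearest center for every pixel.
        labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in pixels
        ]
        # Update step: move each center to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels


# Toy block: three near-white pixels and three near-black pixels.
block = [(250, 250, 250), (255, 255, 255), (10, 10, 10),
         (0, 0, 0), (252, 251, 250), (5, 5, 5)]
labels = kmeans_colors(block, k=2)
```

Pixels that share a label form candidate blobs. In the method the abstract describes, such blobs would then be connected using aspect ratio and pitch and passed to an SVM for character/background classification; those later stages are not covered by this sketch.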