DocumentCode
3340612
Title
A study for high performance character extraction from color scene images
Author
Shirai, Keiichiro ; Wakabayashi, Masanori ; Okamoto, Masayuki ; Yamamoto, Hiroaki
fYear
2008
fDate
16-19 Sept. 2008
Firstpage
293
Lastpage
298
Abstract
This paper describes a method for extracting character strings from scene images. Most characters in scene images share the same color and font size within each word or text line. In our algorithm, a scene image is first divided into several blocks based on edges in the color space. Blobs consisting of similarly colored pixels are then extracted from each block by clustering in a color space. These blobs correspond to either characters or background patterns; after they are connected using their aspect ratios and pitches, an SVM (Support Vector Machine) applied to several textural features of the blobs classifies each connected blob as character or background. Tests on 251 images from the ICDAR 2003 Text Locating Competition show the effectiveness of our algorithm.
Keywords
Computational intelligence; Data mining; Discrete cosine transforms; Filters; Image edge detection; Image segmentation; Layout; Machine learning; Pulse modulation; Text analysis; Character extraction; Clustering; SVM
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis Systems, 2008. DAS '08. The Eighth IAPR International Workshop on
Conference_Location
Nara
Print_ISBN
978-0-7695-3337-7
Type
conf
DOI
10.1109/DAS.2008.57
Filename
4669973