DocumentCode :
1994888
Title :
Using irregular pyramid for text segmentation and binarization of gray scale images
Author :
Loo, Poh-Kok ; Tan, Chew-Lim
Author_Institution :
Singapore Polytech., Singapore
fYear :
2003
fDate :
3-6 Aug. 2003
Firstpage :
594
Abstract :
Most text extraction methods work on binary images, but gray scale images provide much more information for the extraction task. On the other hand, complications arise in separating the textual content from its background region (i.e., thresholding) before the actual text extraction process can begin. Unlike the usual sequence of processes, in which document images are binarized before text extraction, this paper proposes a new method that first segments individual subject areas with the help of an irregular pyramid and then applies binarization. This focuses the binarization process only on the appropriate subject areas before text recognition. Our method overcomes the difficulty that global binarization has in finding a single threshold value to fit all regions. It also avoids the problem, common to most local thresholding techniques, of finding a suitable window size. As shown in our experimental results, changing the sequence of processing in this way lets our method perform well in both text segmentation and binarization.
Keywords :
character recognition; document image processing; feature extraction; image segmentation; binary image; filtering process; gray scale image; irregular pyramid structure; local thresholding technique; text binarization; text extraction method; text recognition; text segmentation; Algorithm design and analysis; Colored noise; Data mining; Feature extraction; Focusing; Image analysis; Image segmentation; Information retrieval; Text analysis; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on
Print_ISBN :
0-7695-1960-1
Type :
conf
DOI :
10.1109/ICDAR.2003.1227733
Filename :
1227733