DocumentCode
3077095
Title
Text Extraction from Complex Document Images Using the Multi-plane Segmentation Technique
Author
Chen, Yen-Lin ; Wu, Bing-Fei
Author_Institution
Nat. Chiao Tung Univ., Hsinchu
Volume
4
fYear
2006
fDate
8-11 Oct. 2006
Firstpage
3540
Lastpage
3547
Abstract
This study presents a new method for extracting characters from various real-life complex document images. The proposed method applies a multi-plane segmentation technique to separate homogeneous objects, including text blocks, non-text graphical objects, and background textures, into individual object planes. It consists of two stages: automatic localized multilevel thresholding, and multi-plane region matching and assembling. A text extraction process is then performed on the resulting planes to detect and extract characters with different characteristics in the respective planes. The proposed method processes document images regionally and adaptively according to their local features. This preserves the detailed characteristics of extracted characters, especially small characters with thin strokes, as well as gradational illuminations of characters. It also allows background objects with uneven, gradational, and sharp variations in contrast, illumination, and texture to be handled effectively. Experimental results on real-life complex document images demonstrate that the proposed method is effective in extracting characters with various illuminations, sizes, and font styles from various types of complex document images.
Keywords
character recognition; document image processing; image matching; image segmentation; automatic localized multilevel thresholding; character extraction; complex document images; multiplane region matching-assembling; multiplane segmentation technique; text extraction; Assembly; Cybernetics; Data mining; Feature extraction; Image analysis; Image edge detection; Image segmentation; Lighting; Prototypes; Text analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man and Cybernetics, 2006. SMC '06. IEEE International Conference on
Conference_Location
Taipei
Print_ISBN
1-4244-0099-6
Electronic_ISBN
1-4244-0100-3
Type
conf
DOI
10.1109/ICSMC.2006.384668
Filename
4274432