DocumentCode :
3249821
Title :
Page segmentation and classification based on pattern-list analysis
Author :
Wang, Jiajun ; Li, Yanling ; Huang, Xianwu ; He, Zhenya
Author_Institution :
Sch. of Electron. & Inf. Eng., Soochow Univ., Suzhou, China
fYear :
2004
fDate :
20-22 Oct. 2004
Firstpage :
735
Lastpage :
738
Abstract :
This paper proposes a new algorithm for page segmentation and classification based on pattern-list analysis. The algorithm has three steps: bounding rectangle location, pattern formation, and pattern classification. Patterns that may have been misclassified are then reclassified using their contextual information. Experimental results show that the algorithm segments text and non-text regions accurately, especially in document images with irregularly shaped halftone regions. The algorithm applies only to binary document images.
Keywords :
document image processing; image classification; image segmentation; text analysis; binary document images; bounding rectangle location; contextual information; irregular-shaped halftone regions; nontext regions; page segmentation; pattern classification; pattern formation; pattern-list analysis; text segmentation; Algorithm design and analysis; Automation; Humans; Image analysis; Image segmentation; Interference; Pattern analysis; Pattern classification; Pixel; Text analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Intelligent Multimedia, Video and Speech Processing, 2004. Proceedings of 2004 International Symposium on
Print_ISBN :
0-7803-8687-6
Type :
conf
DOI :
10.1109/ISIMP.2004.1434169
Filename :
1434169