DocumentCode :
2427660
Title :
Word Image Decomposition from Mixed Text/Graphics Images Using Statistical Methods
Author :
Jeong, Chang-bu ; Kim, Soo-Hyung
Author_Institution :
Honam Univ., Kwangju
Volume :
4
fYear :
2007
fDate :
24-27 Aug. 2007
Firstpage :
624
Lastpage :
628
Abstract :
This paper describes the development and implementation of an algorithm that uses statistical analysis to extract words from mixed text/graphics regions of document images. The algorithm is a component of DIPS (Document Images Processing System), which uses statistical methods. To extract word images from such regions, the character components must first be separated from the graphic components. For this step, we propose a separation method based on box-plot analysis of the statistics of structural components. The accuracy of this method is not sensitive to changes in the images because the separation criterion is defined by the statistics of the components. The character regions are then determined by analyzing the local crowdedness of the separated character elements. Finally, we divide the character regions into text lines and word images using projection profile analysis, gap clustering, and related techniques. Because its criteria are based on the statistics of the image regions, the proposed system can reduce the influence of changes in the images.
Keywords :
document image processing; statistical analysis; text analysis; box-plot; character components; document images processing system; graphic components; image regions; mixed text/graphics images; statistical analyses; statistical methods; structural components; text lines; word extraction; word image decomposition; word images; Computer graphics; Image analysis; Image decomposition; Image retrieval; Indexes; Information retrieval; Internet; Software libraries; Statistical analysis; Statistics;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
Conference_Location :
Haikou
Print_ISBN :
978-0-7695-2874-8
Type :
conf
DOI :
10.1109/FSKD.2007.617
Filename :
4406462