DocumentCode
401883
Title
A Chinese document layout analysis method based on minimal spanning tree clustering
Author
Tian, Xue-dong ; Zhang, Chong
Author_Institution
Fac. of Math. & Comput. Sci., Hebei Univ., China
Volume
5
fYear
2003
fDate
2-5 Nov. 2003
Firstpage
3183
Abstract
To accommodate the special characteristics of Chinese documents, a layout analysis method based on minimal spanning tree clustering is presented. The method is a bottom-up approach: a run-length smoothing algorithm is first applied to the document in the horizontal direction and then in the vertical direction, after which minimal spanning tree clustering is applied. Experiments indicate that the method resolves the problem of Chinese document layout analysis more effectively.
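The pipeline named in the abstract (horizontal smoothing, then vertical smoothing, then minimal spanning tree clustering) might be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function names (`smooth_runs`, `rlsa`, `mst_edges`, `mst_clusters`), the thresholds, the Euclidean distance, and the rule of cutting tree edges longer than a fixed `cut` value are all assumptions, since the abstract gives none of these details. The points passed to `mst_clusters` are assumed to be centroids of the smoothed blocks; extracting them from the image is not shown.

```python
from math import dist

def smooth_runs(line, threshold):
    """Fill background runs (0s) of length <= threshold lying between two foreground pixels (1s)."""
    out = list(line)
    last_fg = None  # index of the most recent foreground pixel seen
    for i, v in enumerate(line):
        if v:
            if last_fg is not None and 0 < i - last_fg - 1 <= threshold:
                for j in range(last_fg + 1, i):
                    out[j] = 1
            last_fg = i
    return out

def rlsa(image, h_threshold, v_threshold):
    """Smooth every row, then smooth every column of the row-smoothed result."""
    rows = [smooth_runs(r, h_threshold) for r in image]
    cols = [smooth_runs(c, v_threshold) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major order

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, i, j) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [(float("inf"), -1)] * n  # (distance to tree, nearest tree vertex)
    best[0] = (0.0, -1)
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i][0])
        in_tree[u] = True
        if best[u][1] >= 0:
            edges.append((best[u][0], best[u][1], u))
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v][0]:
                    best[v] = (d, u)
    return edges

def mst_clusters(points, cut):
    """Keep only MST edges of length <= cut (assumed rule); return one cluster label per point."""
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for w, i, j in mst_edges(points):
        if w <= cut:
            parent[find(i)] = find(j)
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]

# Hypothetical usage: a tiny binary image and five made-up block centroids.
smoothed = rlsa([[1, 0, 1], [0, 0, 0], [1, 0, 1]], h_threshold=1, v_threshold=1)
groups = mst_clusters([(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)], cut=2)
```

In this sketch the smoothing step merges nearby foreground pixels into blocks, and the clustering step groups blocks whose tree edges are short. Other variants of run-length smoothing combine the two directions differently, and other cutting rules for the tree are possible.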
Keywords
document image processing; image classification; image segmentation; image texture; pattern clustering; text analysis; Chinese document layout analysis; bottom-up approach; minimal spanning tree clustering; run length smoothing algorithm; top-down method; Character recognition; Clustering algorithms; Computer science; Graphics; Layout; Mathematics; Noise generators; Optical character recognition software; Smoothing methods
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN
0-7803-8131-9
Type
conf
DOI
10.1109/ICMLC.2003.1260127
Filename
1260127