Title :
A Chinese document layout analysis method based on minimal spanning tree clustering
Author :
Tian, Xue-dong ; Zhang, Chong
Author_Institution :
Fac. of Math. & Comput. Sci., Hebei Univ., China
Abstract :
To accommodate the special characteristics of Chinese documents, a layout analysis method based on minimal spanning tree clustering is presented. The method is a bottom-up approach. First, the run-length smoothing algorithm is applied to the document in the horizontal direction, and then in the vertical direction. Minimal spanning tree clustering is then applied. Experiments suggest that this approach improves the results of Chinese document layout analysis.
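
The two stages named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives no thresholds, distance measure, or cut criterion, so the function names (`rlsa_1d`, `rlsa`, `mst_edges`, `mst_clusters`), the use of block centroids as points, the Euclidean distance, and the fixed `cut` length are all assumptions made for the example.

```python
import math


def rlsa_1d(line, threshold):
    """Fill background runs (0) of length <= threshold that lie between
    two foreground pixels (1). Assumed smoothing rule for this sketch."""
    out = list(line)
    last_fg = None
    for i, value in enumerate(line):
        if value:
            if last_fg is not None and 0 < i - last_fg - 1 <= threshold:
                for j in range(last_fg + 1, i):
                    out[j] = 1
            last_fg = i
    return out


def rlsa(image, h_threshold, v_threshold):
    """Smooth horizontally, then vertically on the horizontal result
    (the order given in the abstract; thresholds are hypothetical)."""
    horizontal = [rlsa_1d(row, h_threshold) for row in image]
    columns = [rlsa_1d(col, v_threshold) for col in zip(*horizontal)]
    return [list(row) for row in zip(*columns)]


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.
    Returns the tree edges as (distance, i, j) tuples."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    edges = []
    if n:
        best[0] = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(points, cut):
    """Drop tree edges longer than `cut` (an assumed criterion) and
    return the remaining connected components as lists of indices."""
    root = list(range(len(points)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for d, i, j in mst_edges(points):
        if d <= cut:
            root[find(i)] = find(j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

In this sketch, `rlsa` would be run on a binarized page (1 for ink, 0 for background), the centroids of the resulting smoothed blocks would be passed to `mst_clusters`, and each returned group of indices would be treated as one candidate layout region.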
Keywords :
document image processing; image classification; image segmentation; image texture; pattern clustering; text analysis; Chinese document layout analysis; bottom-up approach; minimal spanning tree clustering; run length smoothing algorithm; top-down method; Character recognition; Clustering algorithms; Computer science; Graphics; Layout; Mathematics; Noise generators; Optical character recognition software; Smoothing methods
Conference_Title :
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN :
0-7803-8131-9
DOI :
10.1109/ICMLC.2003.1260127