• DocumentCode
    401883
  • Title

    A Chinese document layout analysis method based on minimal spanning tree clustering

  • Author

    Tian, Xue-dong ; Zhang, Chong

  • Author_Institution
    Fac. of Math. & Comput. Sci., Hebei Univ., China
  • Volume
    5
  • fYear
    2003
  • fDate
    2-5 Nov. 2003
  • Firstpage
    3183
  • Abstract
    A layout analysis method based on minimal spanning tree clustering is presented to accommodate some special characteristics of Chinese documents. The method is a bottom-up approach: the run-length smoothing algorithm is first applied to the document in the horizontal direction and then in the vertical direction, after which minimal spanning tree clustering is applied. Experiments suggest that this approach improves Chinese document layout analysis.
  • Keywords
    document image processing; image classification; image segmentation; image texture; pattern clustering; text analysis; Chinese document layout analysis; bottom-up approach; minimal spanning tree clustering; run length smoothing algorithm; top-down method; Character recognition; Clustering algorithms; Computer science; Graphics; Layout; Mathematics; Noise generators; Optical character recognition software; Smoothing methods
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2003 International Conference on
  • Print_ISBN
    0-7803-8131-9
  • Type

    conf

  • DOI
    10.1109/ICMLC.2003.1260127
  • Filename
    1260127
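
The pipeline outlined in the abstract (horizontal run-length smoothing, then vertical smoothing, then minimal spanning tree clustering) can be illustrated with a minimal sketch. This is not the authors' implementation; the record gives no parameters or details, so everything below is an assumption made for illustration: the function names (`rlsa_1d`, `rlsa`, `components`, `mst_clusters`), the thresholds, the sequential application of the two smoothing passes, the use of 4-connected block centroids as graph nodes, Euclidean edge weights, and cutting MST edges longer than a fixed length.

```python
from itertools import combinations

def rlsa_1d(line, threshold):
    """Fill background runs (0s) of length <= threshold lying between two 1s."""
    out = list(line)
    last_one = None
    for i, v in enumerate(line):
        if v == 1:
            if last_one is not None and 0 < i - last_one - 1 <= threshold:
                for j in range(last_one + 1, i):
                    out[j] = 1
            last_one = i
    return out

def rlsa(image, h_threshold, v_threshold):
    """Smooth rows first, then smooth the columns of the result (assumed order)."""
    horiz = [rlsa_1d(row, h_threshold) for row in image]
    cols = [rlsa_1d(list(col), v_threshold) for col in zip(*horiz)]
    return [list(row) for row in zip(*cols)]

def components(image):
    """Return the (row, col) centroid of each 4-connected foreground block."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    centroids = []
    for y in range(h):
        for x in range(w):
            if image[y][x] != 1 or seen[y][x]:
                continue
            stack, pixels = [(y, x)], []
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                pixels.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            n = len(pixels)
            centroids.append((sum(p[0] for p in pixels) / n, sum(p[1] for p in pixels) / n))
    return centroids

def mst_clusters(points, cut):
    """Build an MST with Kruskal's algorithm and drop edges longer than `cut`.

    The surviving subtrees are the clusters; returns one label per point.
    """
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = sorted(
        (((points[a][0] - points[b][0]) ** 2 + (points[a][1] - points[b][1]) ** 2) ** 0.5, a, b)
        for a, b in combinations(range(len(points)), 2)
    )
    for weight, a, b in edges:
        if weight > cut:
            break  # all remaining MST edges would be cut anyway
        ra, rb = find(a), find(b)
        if ra != rb:  # this edge belongs to the MST
            parent[ra] = rb
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(points))]
```

In this sketch, `rlsa` merges nearby foreground pixels into blocks, `components` reduces each block to a centroid, and `mst_clusters` groups the centroids. Cutting long MST edges is equivalent to single-linkage clustering at the chosen distance; the paper may use a different cutting criterion.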