• DocumentCode
    2439693
  • Title
    A clustering-based approach to the separation of text strings from mixed text/graphics documents
  • Author
    He, Shoujie; Abe, Norihiro

  • Author_Institution
    Dept. of Inf. Syst. & Comput. Sci., Nat. Univ. of Singapore, Singapore
  • Volume
    3
  • fYear
    1996
  • fDate
    25-29 Aug 1996
  • Firstpage
    706
  • Abstract
    A clustering-based approach to the separation of text from mixed text/graphics documents is presented. The approach starts from the grouping of connected components. Clustering is employed at three critical stages to improve the efficiency and effectiveness of the grouping: before the grouping, before orientation estimation, and after orientation estimation. Because the estimated orientation is highly accurate, not only overgrouping but also most undergrouping cases can be handled successfully.
  • Keywords
    document image processing; image recognition; clustering-based approach; mixed text/graphics documents; orientation estimation; text string separation; Computer graphics; Computer science; Data mining; Helium; Information systems; Maximum likelihood estimation; Smoothing methods; Systems engineering and theory; Testing; Tree data structures
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the 13th International Conference on Pattern Recognition, 1996
  • Conference_Location
    Vienna
  • ISSN
    1051-4651
  • Print_ISBN
    0-8186-7282-X
  • Type
    conf
  • DOI
    10.1109/ICPR.1996.547037
  • Filename
    547037