• DocumentCode
    3306674
  • Title
    Clustering and classification of document structure-a machine learning approach
  • Author
    Dengel, Andreas ; Dubiel, Frank
  • Author_Institution
    German Res. Center for Artificial Intelligence, Kaiserslautern, Germany
  • Volume
    2
  • fYear
    1995
  • fDate
    14-16 Aug 1995
  • Firstpage
    587
  • Abstract
    We describe a system that learns the presentation of document logical structures, illustrated using business letters. Given a set of instances, the system clusters them into structural concepts and induces a concept hierarchy. This hierarchy then serves as the basis for classifying future input. The paper introduces the individual learning steps, describes how the resulting concept hierarchy is applied to logical labeling, and reports on the results.
  • Keywords
    business data processing; classification; document handling; knowledge based systems; learning by example; technical presentation; business letters; concept hierarchy; document logical structure presentation; document structure classification; document structure clustering; logical labeling; machine learning approach; Artificial intelligence; Classification tree analysis; Costs; Decision trees; Fuzzy logic; Information retrieval; Labeling; Logic testing; Machine learning; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
  • Conference_Location
    Montreal, Que.
  • Print_ISBN
    0-8186-7128-9
  • Type
    conf
  • DOI
    10.1109/ICDAR.1995.601965
  • Filename
    601965