• DocumentCode
    3211214
  • Title
    An experimental page layout recognition system for office document automatic classification: an integrated approach for inductive generalization
  • Author
    Esposito, Floriana ; Malerba, Donato ; Semeraro, Giovanni ; Annese, Enrico ; Scafuro, Giovanna
  • Author_Institution
    Istituto di Sci. dell'Inf., Bari Univ., Italy
  • Volume
    i
  • fYear
    1990
  • fDate
    16-21 Jun 1990
  • Firstpage
    557
  • Abstract
    A novel approach to the automatic classification of digitized office documents, based on the inductive generalization of their layout style, is presented. It is supported by the observation that, for a number of printed documents, it is possible to find a set of relevant and invariant layout features. These are geometrical characteristics automatically detected through a segmentation and layout analysis process. The learning step, in which significant examples of document classes are used to train the classification system, involves the novel idea of integrating parametric (numerical) and conceptual (symbolic) learning methods.
  • Keywords
    computerised pattern recognition; document image processing; learning systems; office automation; conceptual learning; geometrical characteristics; inductive generalization; layout analysis; office document automatic classification; page layout recognition system; pattern recognition; segmentation; Background noise; Character generation; Data analysis; Document handling; Engines; Information retrieval; Knowledge acquisition; Learning systems; Multimedia systems; System testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 1990. Proceedings., 10th International Conference on
  • Conference_Location
    Atlantic City, NJ
  • Print_ISBN
    0-8186-2062-5
  • Type
    conf
  • DOI
    10.1109/ICPR.1990.118164
  • Filename
    118164