• DocumentCode
    398585
  • Title
    Stochastic attributed K-d tree modeling of technical paper title pages
  • Author
    Mao, Song ; Rosenfeld, Azriel ; Kanungo, Tapas
  • Author_Institution
    Nat. Libr. of Med., Bethesda, MD, USA
  • Volume
    1
  • fYear
    2003
  • fDate
    14-17 Sept. 2003
  • Abstract
    Structural information about a document is essential for structured query processing, indexing, and retrieval. A document page can be partitioned into a hierarchy of homogeneous regions such as columns, paragraphs, etc.; these regions are called physical components, and define the physical layout of the page. In this paper we develop a class of models for the physical layouts of technical paper title pages. We model physical layout using hidden semi-Markov models for directional projections of page regions, and a stochastic attributed K-d tree grammar model for the 2D hierarchical structure of these regions. We use the models to generate sets of synthetic title page images of three distinctive styles, which we use in controlled experiments on page structure analysis.
  • Keywords
    hidden Markov models; image retrieval; 2D hierarchical structure; document page; hidden semi-Markov models; homogeneous regions; image indexing; physical components; stochastic attributed K-d tree modeling; structured query processing; synthetic title page images; technical paper title pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing, 2003. ICIP 2003. Proceedings. 2003 International Conference on
  • ISSN
    1522-4880
  • Print_ISBN
    0-7803-7750-8
  • Type
    conf
  • DOI
    10.1109/ICIP.2003.1247016
  • Filename
    1247016