DocumentCode :
398585
Title :
Stochastic attributed K-d tree modeling of technical paper title pages
Author :
Mao, Song ; Rosenfeld, Azriel ; Kanungo, Tapas
Author_Institution :
Nat. Libr. of Med., Bethesda, MD, USA
Volume :
1
fYear :
2003
fDate :
14-17 Sept. 2003
Abstract :
Structural information about a document is essential for structured query processing, indexing, and retrieval. A document page can be partitioned into a hierarchy of homogeneous regions such as columns and paragraphs; these regions are called physical components, and they define the physical layout of the page. In this paper we develop a class of models for the physical layouts of technical paper title pages. We model physical layout using hidden semi-Markov models for directional projections of page regions, and a stochastic attributed K-d tree grammar model for the 2D hierarchical structure of these regions. We use the models to generate sets of synthetic title page images in three distinctive styles, which we use in controlled experiments on page structure analysis.
Keywords :
hidden Markov models; image retrieval; 2D hierarchical structure; document page; hidden semi-Markov models; homogeneous regions; image indexing; physical components; stochastic attributed K-d tree modeling; structured query processing; synthetic title page images; technical paper title pages
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Image Processing, 2003. ICIP 2003. Proceedings. 2003 International Conference on
ISSN :
1522-4880
Print_ISBN :
0-7803-7750-8
Type :
conf
DOI :
10.1109/ICIP.2003.1247016
Filename :
1247016