Title :
Learning document structure for retrieval and classification
Author :
Kumar, Jayant ; Peng Ye ; Doermann, David
Author_Institution :
Univ. of Maryland, College Park, MD, USA
Abstract :
In this paper, we present a method for retrieving document images with chosen layout characteristics. The proposed method is based on statistics of patch-codewords over different regions of the image. We begin with a set of wanted images and a random set of unwanted images representative of a large heterogeneous collection. We then use raw-image patches extracted from the unlabeled images to learn a codebook. To model the spatial relationships between patches, the image is recursively partitioned horizontally and vertically, and a histogram of patch-codewords is computed in each partition. The resulting set of features gives high precision and recall for the retrieval of hand-drawn and machine-print table-documents, and of unconstrained mixed form-type documents, when used with a random forest classifier. We compare our method to the spatial-pyramid method and show that the proposed approach for learning layout characteristics is competitive for document images.
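The partition-and-histogram step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the partitioning details, so the sketch assumes that each level splits the page into 2^level equal strips, once with horizontal cuts and once with vertical cuts, and that every patch has already been assigned a codeword index. The names `Patch`, `strip_histograms`, and `layout_features` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical patch record: (x, y, codeword index). The codeword index is
# assumed to come from a codebook learned beforehand (not shown here).
Patch = Tuple[float, float, int]


def strip_histograms(patches: Sequence[Patch], extent: float,
                     codebook_size: int, levels: int, axis: int) -> List[float]:
    """Normalized codeword histograms over 1, 2, 4, ... equal strips.

    axis=1 bins patches by y (horizontal cuts); axis=0 bins by x (vertical cuts).
    """
    features: List[float] = []
    for level in range(levels):
        n_strips = 2 ** level
        hists = [[0] * codebook_size for _ in range(n_strips)]
        for patch in patches:
            # Map the coordinate to a strip index, clamping the far edge.
            strip = min(int(patch[axis] * n_strips / extent), n_strips - 1)
            hists[strip][patch[2]] += 1
        for hist in hists:
            total = sum(hist)
            # Empty strips contribute an all-zero histogram.
            features.extend(c / total if total else 0.0 for c in hist)
    return features


def layout_features(patches: Sequence[Patch], width: float, height: float,
                    codebook_size: int, levels: int = 3) -> List[float]:
    """Concatenate histograms from horizontal and vertical partitions."""
    horizontal = strip_histograms(patches, height, codebook_size, levels, axis=1)
    vertical = strip_histograms(patches, width, codebook_size, levels, axis=0)
    return horizontal + vertical


# Toy page: one patch per quadrant, each with a different codeword.
toy_patches = [(10, 10, 0), (90, 10, 1), (10, 90, 2), (90, 90, 3)]
toy_features = layout_features(toy_patches, width=100, height=100,
                               codebook_size=4, levels=2)
```

Under these assumptions the feature vector has 2 * (2^levels - 1) * codebook_size entries, and the whole-page histogram appears once in each half. A vector of this kind could then be passed to any standard classifier, such as the random forest named in the abstract.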
Keywords :
document image processing; image classification; image retrieval; learning (artificial intelligence); statistics; codebook learning; document structure learning; hand-drawn-documents; heterogeneous collection; layout characteristics learning; machine-print table-documents; patch-codeword histogram; patch-codeword statistics; random forest classifier; raw-image patch extraction; spatial-pyramid method; unconstrained mixed form-type documents; unlabeled images; unwanted images; Accuracy; Computer vision; Feature extraction; Histograms; Layout; Radio frequency; Support vector machines
Conference_Title :
Pattern Recognition (ICPR), 2012 21st International Conference on
Conference_Location :
Tsukuba
Print_ISBN :
978-1-4673-2216-4