DocumentCode :
2147601
Title :
Document Image Classification and Labeling Using Multiple Instance Learning
Author :
Kumar, Jayant ; Pillai, Jaishanker ; Doermann, David
Author_Institution :
Inst. of Adv. Comput. Studies, Univ. of Maryland, College Park, MD, USA
fYear :
2011
fDate :
18-21 Sept. 2011
Firstpage :
1059
Lastpage :
1063
Abstract :
Labeling large sets of images for training or testing analysis systems can be a very costly and time-consuming process. Multiple instance learning (MIL) is a generalization of traditional supervised learning that relaxes the need for exact labels on training instances. Instead, labels are required only for sets of instances, known as bags. In this paper, we apply MIL to the retrieval and localization of signatures and to the retrieval of images containing machine-printed text, and show that a gain of 15-20% in performance can be achieved over supervised learning with weak labeling. We also compare our approach to supervised learning with fully annotated training data and report competitive accuracy for MIL. Through experiments on real-world datasets, we show that MIL is a good alternative when the training data has only document-level annotation.
Keywords :
document image processing; image classification; image retrieval; learning (artificial intelligence); document image classification; document image labeling; document-level annotation; fully annotated training data; machine-printed text; multiple instance learning; signature localization; signature retrieval; testing analysis systems; traditional supervised learning; training analysis systems; weak-labeling; Feature extraction; Handwriting recognition; Histograms; Image segmentation; Supervised learning; Support vector machines; Training; Document Image Labeling; Machine-print Documents; Signature Detection
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location :
Beijing
ISSN :
1520-5363
Print_ISBN :
978-1-4577-1350-7
Electronic_ISBN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2011.214
Filename :
6065472