Title :
Multipage document retrieval by textual and visual representations
Author :
Rusinol, Marcal ; Karatzas, Dimosthenis ; Bagdanov, Andrew D. ; Llados, Josep
Author_Institution :
Dept. Cienc. de la Computacio, Univ. Autonoma de Barcelona, Bellaterra, Spain
Abstract :
In this paper, we present a multipage administrative document image retrieval system based on textual and visual representations of document pages. Individual pages are represented by textual or visual information using a bag-of-words framework. We evaluate different fusion strategies that allow the system to perform multipage document retrieval on the basis of a single-page retrieval system. Results are reported on a large dataset of document images sampled from a banking workflow.
Keywords :
banking; document image processing; information retrieval; bag-of-words framework; banking workflow; document images; document pages; fusion strategies; multipage administrative document image retrieval system; single page retrieval system; textual representations; visual representations; Banking; Histograms; Image retrieval; Semantics; Vectors; Visualization; Vocabulary;
Conference_Title :
Pattern Recognition (ICPR), 2012 21st International Conference on
Conference_Location :
Tsukuba
Print_ISBN :
978-1-4673-2216-4