DocumentCode :
2825497
Title :
A Web-based System for Retrieving Document Images from Digital Library
Author :
Zhang, Li ; Lu, Yue ; Tan, Chew Lim
Author_Institution :
National University of Singapore, Kent Ridge
Volume :
3
fYear :
2003
fDate :
16-22 June 2003
Firstpage :
27
Lastpage :
27
Abstract :
This paper describes a web-based system for retrieving imaged documents from a digital library. First, image preprocessing is performed off-line on each imaged document to extract its word objects. Each word object is then represented by a string known as its feature code, and a feature code file is constructed for the corresponding document. On the web interface side, the user enters a set of query words and chooses whether to combine them with an "AND" or an "OR" operation. On receiving the user's request, the system processes each query word and combines the results according to the chosen operation. Each query word is first looked up in an index table that stores previously queried words. If a match is found, the results are retrieved directly from the index table and stored temporarily for subsequent merging. This speeds up searching and makes the system an incremental intelligence system. If no match is found, the system converts the query word to a feature code string and uses a partial word matching approach to search the pre-generated feature code files. Preliminary experimental results on imaged documents of students' theses provided by our digital library show that the proposed system is efficient and promising for document image retrieval, and thus has potential applications in digital libraries.
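The query flow described in the abstract (index-table lookup first, feature-code search as a fallback, then "AND"/"OR" merging) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `toy_feature_code` is a hypothetical stand-in for the paper's feature code (it only classifies letters as ascender, descender or x-height, so many different words share a code), plain substring matching stands in for the paper's partial word matching, and the names `Retriever`, `build_code_files`, `index_table` and `scans` are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Set


def toy_feature_code(word: str) -> str:
    """Hypothetical feature code: 'a' ascender, 'd' descender, 'x' otherwise."""
    ascenders, descenders = set("bdfhklt"), set("gjpqy")
    return "".join(
        "a" if ch in ascenders else "d" if ch in descenders else "x"
        for ch in word.lower()
    )


def build_code_files(texts: Dict[str, str]) -> Dict[str, List[str]]:
    """Stand-in for the off-line step: one list of word codes per document."""
    return {doc: [toy_feature_code(w) for w in text.split()]
            for doc, text in texts.items()}


class Retriever:
    def __init__(self, code_files: Dict[str, List[str]]) -> None:
        self.code_files = code_files
        self.index_table: Dict[str, Set[str]] = {}  # previously queried words
        self.scans = 0  # number of full feature-code searches performed

    def _scan(self, word: str) -> Set[str]:
        # Assumption: substring containment approximates partial word matching.
        self.scans += 1
        query_code = toy_feature_code(word)
        return {doc for doc, codes in self.code_files.items()
                if any(query_code in code for code in codes)}

    def lookup(self, word: str) -> Set[str]:
        key = word.lower()
        if key not in self.index_table:      # miss: search the code files
            self.index_table[key] = self._scan(key)
        return set(self.index_table[key])    # hit: reuse the stored result

    def query(self, words: Iterable[str], op: str = "AND") -> Set[str]:
        results = [self.lookup(w) for w in words]
        if not results:
            return set()
        if op == "AND":
            return set.intersection(*results)
        if op == "OR":
            return set.union(*results)
        raise ValueError("op must be 'AND' or 'OR'")


code_files = build_code_files({
    "thesis1": "image retrieval system",
    "thesis2": "digital library",
})
retriever = Retriever(code_files)
both = retriever.query(["image", "library"], op="AND")
either = retriever.query(["image", "library"], op="OR")
```

In this sketch the second query reuses the index table entries created by the first, so no further search of the feature code files is needed; that reuse is the caching behaviour the abstract attributes to the index table.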
Keywords :
Character recognition; Data mining; Image converters; Image databases; Image retrieval; Image segmentation; Information retrieval; Layout; Optical character recognition software; Software libraries;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Vision and Pattern Recognition Workshop, 2003. CVPRW '03. Conference on
Conference_Location :
Madison, Wisconsin, USA
ISSN :
1063-6919
Print_ISBN :
0-7695-1900-8
Type :
conf
DOI :
10.1109/CVPRW.2003.10025
Filename :
4624285