Title :
User-defined template for identifying document type and extracting information from documents
Author :
Kochi, Tsukasa ; Saitoh, Takashi
Author_Institution :
Software Res. Center, Ricoh Co. Ltd., Yokohama, Japan
Abstract :
An automatic document entry system is described that identifies the type of document and extracts textual information, such as titles or authors, from semi-formatted document images. The system registers documents, offers easy retrieval of documents used in a daily workflow, analyzes the layout structure of documents by using document-specific models, and assumes that each type of document is known in advance. In this paper we focus on a method for identifying the type of document.
Keywords :
document image processing; image registration; information retrieval; workflow management software; authors; automatic document entry system; daily workflow; document registration; document retrieval; document specific models; document type identification; layout structure analysis; semi-formatted document images; textual information extraction; titles; user-defined template; Costs; Data mining; Facsimile; Image analysis; Image classification; Image processing; Image retrieval; Noise robustness; Prototypes; SGML
Conference_Title :
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location :
Bangalore
Print_ISBN :
0-7695-0318-7
DOI :
10.1109/ICDAR.1999.791741