Title :
Probabilistic interpage analysis for article extraction from document images
Author :
Takasu, Atsuhiro
Author_Institution :
Res. & Dev. Dept., Nat. Center for Sci. Inf. Syst., Tokyo, Japan
Abstract :
Progress in information processing and utilization technologies makes it possible to handle a huge information space. To construct such a space, it is important to utilize the information stored in printed papers. This paper presents an interpage analysis method that processes one issue of a journal or magazine at a time. The method is based on an error-correcting parsing technique, which makes it possible to handle errors from the recognition processes. The sequence of pages is parsed at two levels, the journal level and the logical component level, to extract the logical components of journals and magazines.
Keywords :
document image processing; feature extraction; grammars; optical character recognition; pattern classification; probabilistic logic; article extraction; character recognition; error correcting parsing; interpage analysis; logical analysis; page classification; page layout analysis; probabilistic grammar; image analysis
Conference_Title :
Pattern Recognition, 1998. Proceedings. Fourteenth International Conference on
Conference_Location :
Brisbane, Qld.
Print_ISBN :
0-8186-8512-3
DOI :
10.1109/ICPR.1998.711387