Title :
Stochastic error-correcting parsing for OCR post-processing
Author :
Perez-Cortes, Juan C. ; Amengual, Juan C. ; Arlandis, Joaquim ; Llobet, Rafael
Author_Institution :
Inst. Tecnologico de Inf., Univ. Politecnica de Valencia, Spain
Abstract :
In this paper, stochastic error-correcting parsing is proposed as a powerful and flexible method to post-process the results of an optical character recognizer (OCR). Deterministic and nondeterministic approaches are possible under the proposed setting. The basic units of the model can be words or complete sentences, and the lexicons or the language databases can be simple enumerations or may convey probabilistic information from the application domain.
Keywords :
document image processing; error correction; grammars; optical character recognition; stochastic processes; OCR post-processing; deterministic approach; language databases; nondeterministic approach; optical character recognizer; probabilistic information; stochastic error-correcting parsing; Character recognition; Databases; Error correction; Handwriting recognition; Hidden Markov models; Optical character recognition software; Optical sensors; Stochastic processes; Text recognition; Uncertainty
Conference_Title :
Proceedings of the 15th International Conference on Pattern Recognition (ICPR 2000)
Conference_Location :
Barcelona
Print_ISBN :
0-7695-0750-6
DOI :
10.1109/ICPR.2000.902944