  • DocumentCode
    2870477
  • Title
    Stochastic error-correcting parsing for OCR post-processing
  • Author
    Perez-Cortes, Juan C. ; Amengual, Juan C. ; Arlandis, Joaquim ; Llobet, Rafael
  • Author_Institution
    Inst. Tecnologico de Inf., Univ. Politecnica de Valencia, Spain
  • Volume
    4
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    405
  • Abstract
    In this paper, stochastic error-correcting parsing is proposed as a powerful and flexible method to post-process the results of an optical character recognizer (OCR). Deterministic and nondeterministic approaches are possible under the proposed setting. The basic units of the model can be words or complete sentences, and the lexicons or the language databases can be simple enumerations or may convey probabilistic information from the application domain.
  • Keywords
    document image processing; error correction; grammars; optical character recognition; stochastic processes; OCR post-processing; deterministic approach; language databases; nondeterministic approach; optical character recognizer; probabilistic information; stochastic error-correcting parsing; Character recognition; Databases; Error correction; Handwriting recognition; Hidden Markov models; Optical character recognition software; Optical sensors; Stochastic processes; Text recognition; Uncertainty
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 2000. Proceedings. 15th International Conference on
  • Conference_Location
    Barcelona
  • ISSN
    1051-4651
  • Print_ISBN
    0-7695-0750-6
  • Type
    conf
  • DOI
    10.1109/ICPR.2000.902944
  • Filename
    902944