• DocumentCode
    3023181
  • Title

    Algorithms for postprocessing OCR results with visual inter-word constraints

  • Author

    Hong, Tao ; Hull, Jonathan J.

  • Author_Institution
    Center of Excellence for Document Anal. & Recognition, State Univ. of New York, Buffalo, NY, USA
  • Volume
    3
  • fYear
    1995
  • fDate
    23-26 Oct 1995
  • Firstpage
    312
  • Abstract
    Algorithms are presented that determine the visual relationships between word images in a document. These include instances of common word images and common substrings that occur often in English-language text images. This information is then used to improve the performance of a commercial optical character recognition (OCR) algorithm. The algorithms presented calculate clusters of equivalent word images as well as common initial and final substrings. Experimental results show that a 40% reduction in word-level error rate is achieved on a test set of documents degraded by uniform noise.
  • Keywords
    document image processing; optical character recognition; English language text images; OCR algorithm; OCR results; experimental results; optical character recognition algorithm; performance; postprocessing algorithms; substrings; text document; uniform noise; visual interword constraints; word images; word level error rate; Character recognition; Clustering algorithms; Degradation; Error analysis; Natural languages; Noise level; Noise reduction; Optical character recognition software; Optical noise; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing, 1995. Proceedings., International Conference on
  • Conference_Location
    Washington, DC
  • Print_ISBN
    0-8186-7310-9
  • Type

    conf

  • DOI
    10.1109/ICIP.1995.537638
  • Filename
    537638