  • DocumentCode
    2422272
  • Title
    A system for reading low quality characters from printouts
  • Author
    Kovacs, Z.M.
  • Author_Institution
    MicroAcad. Srl, Bologna
  • Volume
    3
  • fYear
    1996
  • fDate
    25-29 Aug 1996
  • Firstpage
    185
  • Abstract
    This paper presents a system for reading low-quality machine-printed characters, intended for computer printouts whose data file is not available. The characters are assumed to belong to a set of known fonts and to be organized into tables or columns. The printers used for such documents are usually fast and their print quality is low, owing to worn ink ribbons and damaged nozzles or print heads. Standard machine-printed OCR systems show an error rate of about 15% on these sheets, so a specific technique is needed. To cope with broken characters and character pieces, the system uses a two-step strategy. First, it tries to match the unknown character using a moving-window technique. If this fails, it builds a new set of reference images from the characters already recognized in the document and repeats the first matching step, thereby exploiting the correlation among damaged characters. The first step reaches a 2% error rate, and applying the second step lowers it to 0.15%. This low error rate is possible because the system adapts its behavior to the damaged characters produced by the printer. The average recognition time on a SUN SparcStation 10 is 15 ms/character, measured on about 100,000 characters contained in 50 documents.
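The two-step strategy described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the binary 5x5 glyphs, the pixel-agreement score, the one-pixel shift range, the 0.9 acceptance threshold, and the names `score`, `best_match` and `recognize`. The paper's actual matching criterion and reference-set construction may well differ.

```python
from collections import defaultdict

def parse(rows):
    """Turn rows of '#'/'.' into a tuple of tuples of 1/0 (hypothetical glyph format)."""
    return tuple(tuple(1 if c == "#" else 0 for c in row) for row in rows)

def score(glyph, ref, dy, dx):
    """Fraction of pixels that agree when ref is shifted by (dy, dx) over glyph."""
    h, w = len(glyph), len(glyph[0])
    agree = 0
    for y in range(h):
        for x in range(w):
            yy, xx = y - dy, x - dx
            ref_pixel = ref[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
            agree += glyph[y][x] == ref_pixel
    return agree / (h * w)

def best_match(glyph, refs, max_shift=1):
    """Moving-window match: try every small offset of every reference image."""
    best_label, best_score = None, -1.0
    shifts = range(-max_shift, max_shift + 1)
    for label, templates in refs.items():
        for ref in templates:
            for dy in shifts:
                for dx in shifts:
                    s = score(glyph, ref, dy, dx)
                    if s > best_score:
                        best_label, best_score = label, s
    return best_label, best_score

def recognize(glyphs, font_refs, threshold=0.9):
    """Step 1: match against the known fonts. Step 2: rematch the rejects
    against references built from the glyphs accepted in step 1."""
    labels = [None] * len(glyphs)
    learned = defaultdict(list)
    for i, g in enumerate(glyphs):
        label, s = best_match(g, font_refs)
        if s >= threshold:
            labels[i] = label
            learned[label].append(g)  # an accepted glyph becomes a new reference
    if learned:
        for i, g in enumerate(glyphs):
            if labels[i] is None:
                label, s = best_match(g, learned)
                if s >= threshold:
                    labels[i] = label
    return labels

# Toy data: clean font references and three glyphs from one "document".
FONT = {
    "O": [parse([".....", ".###.", ".#.#.", ".###.", "....."])],
    "I": [parse([".....", "..#..", "..#..", "..#..", "....."])],
}
LIGHT_O = parse([".....", "..##.", ".#.#.", ".###.", "....."])  # one pixel lost
CLEAN_I = FONT["I"][0]
HEAVY_O = parse([".....", "..##.", ".#...", ".#.#.", "....."])  # three pixels lost

print(recognize([LIGHT_O, CLEAN_I, HEAVY_O], FONT))  # ['O', 'I', 'O']
```

In this toy run the heavily damaged glyph is rejected in the first step, because its best score against the clean font is 0.88, and accepted in the second step, because it scores 0.92 against the lightly damaged glyph that shares its printer defect.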
  • Keywords
    image segmentation; optical character recognition; SUN SparcStation 10; broken characters; character pieces; computer printouts; low quality machine-printed characters; moving-window technique; two step strategy; Character recognition; Error analysis; Image quality; Image recognition; Magnetic heads; Optical character recognition software; Printers; Printing; Sun; Time sharing computer systems
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 1996., Proceedings of the 13th International Conference on
  • Conference_Location
    Vienna
  • ISSN
    1051-4651
  • Print_ISBN
    0-8186-7282-X
  • Type
    conf
  • DOI
    10.1109/ICPR.1996.546935
  • Filename
    546935