• DocumentCode
    985106
  • Title

    Character recognition without segmentation

  • Author

    Rocha, Jairo ; Pavlidis, Theo

  • Author_Institution
    Dept. de Math. Inf., Univ. de les Illes Balears, Palma de Mallorca, Spain
  • Volume
    17
  • Issue
    9
  • fYear
    1995
  • fDate
    9/1/1995 12:00:00 AM
  • Firstpage
    903
  • Lastpage
    909
  • Abstract
    A segmentation-free approach to OCR is presented as part of a knowledge-based word interpretation model. It is based on the recognition of subgraphs homeomorphic to previously defined prototypes of characters. Gaps are identified as potential parts of characters by implementing a variant of the notion of relative neighborhood used in computational perception. Each subgraph of strokes that matches a previously defined character prototype is recognized anywhere in the word even if it corresponds to a broken character or to a character touching another one. The characters are detected in the order defined by the matching quality. Each subgraph that is recognized is introduced as a node in a directed net that compiles different alternatives of interpretation of the features in the feature graph. A path in the net represents a consistent succession of characters. A final search for the optimal path under certain criteria gives the best interpretation of the word features. Broken characters are recognized by looking for gaps between features that may be interpreted as part of a character. Touching characters are recognized because the matching allows nonmatched adjacent strokes. Recognition results on over 24,000 printed numeral characters from a USPS database and on some hand-printed words confirmed the method's high level of robustness.
  • Keywords
    graph theory; knowledge based systems; optical character recognition; OCR; broken character recognition; computational perception; hand-printed words; homeomorphic subgraph recognition; knowledge-based word interpretation model; nonmatched adjacent strokes; optimal path search; printed numeral characters; robustness; touching character recognition; Character recognition; Computer science; Decision trees; Feature extraction; Optical character recognition software; Performance analysis; Prototypes; Robustness; Spatial databases
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type

    jour

  • DOI
    10.1109/34.406657
  • Filename
    406657
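
The abstract describes a directed net in which each recognized subgraph is a node and each path is a consistent succession of characters, followed by a search for the optimal path. The record does not state the search criteria, so the following is only a minimal sketch under simplifying assumptions: features are treated as a left-to-right sequence (the paper works on a feature graph), each candidate carries a hypothetical matching cost where lower is better, and a feature may be left unmatched for a fixed hypothetical penalty `skip_cost`. The names `Candidate` and `best_interpretation` are illustrative and do not come from the paper.

```python
from typing import List, NamedTuple, Optional, Tuple


class Candidate(NamedTuple):
    """A character hypothesis covering features [start, end). Hypothetical structure."""
    label: str
    start: int   # index of the first feature covered (inclusive)
    end: int     # index one past the last feature covered
    cost: float  # assumed matching cost, lower is better


def best_interpretation(
    candidates: List[Candidate], n_features: int, skip_cost: float = 1.0
) -> Tuple[float, List[str]]:
    """Return (total_cost, labels) of the cheapest left-to-right path.

    Positions 0..n_features act as nodes of a directed acyclic net; each
    candidate is an edge from its start to its end, and leaving one feature
    unmatched is an edge of weight skip_cost (an assumption of this sketch).
    """
    for c in candidates:
        if not (0 <= c.start < c.end <= n_features):
            raise ValueError(f"candidate out of range: {c}")

    inf = float("inf")
    best = [inf] * (n_features + 1)
    back: List[Optional[Tuple[int, Optional[str]]]] = [None] * (n_features + 1)
    best[0] = 0.0

    # Group candidates by the position where they begin.
    by_start = {}
    for c in candidates:
        by_start.setdefault(c.start, []).append(c)

    # Positions are already in topological order, so one forward pass suffices.
    for pos in range(n_features):
        # Option 1: leave the feature at `pos` unmatched.
        if best[pos] + skip_cost < best[pos + 1]:
            best[pos + 1] = best[pos] + skip_cost
            back[pos + 1] = (pos, None)
        # Option 2: accept a character hypothesis starting here.
        for c in by_start.get(pos, []):
            total = best[pos] + c.cost
            if total < best[c.end]:
                best[c.end] = total
                back[c.end] = (pos, c.label)

    # Walk the back-pointers to recover the chosen labels.
    labels: List[str] = []
    pos = n_features
    while pos > 0:
        prev, label = back[pos]
        if label is not None:
            labels.append(label)
        pos = prev
    return best[n_features], labels[::-1]


example = [
    Candidate("1", 0, 1, 0.2),
    Candidate("0", 1, 3, 0.3),
    Candidate("7", 0, 2, 0.9),
    Candidate("1", 2, 3, 0.4),
]
total_cost, word = best_interpretation(example, n_features=3)
```

In this toy input, two readings compete for the same three features ("1" then "0", or "7" then "1"), and the search keeps the cheaper one. A real system would derive the costs from the subgraph matching quality and would have to handle candidates that overlap in a graph rather than in a sequence.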