• DocumentCode
    3019770
  • Title
    A generic method for determining the up/down orientation of text in Roman and non-Roman scripts
  • Author
    Aradhye, Hrishikesh B.
  • Author_Institution
    SRI Int., Menlo Park, CA, USA
  • fYear
    2005
  • fDate
    29 Aug.-1 Sept. 2005
  • Firstpage
    187
  • Abstract
    This paper presents a method for determining the up/down orientation of text in a scanned document of unknown orientation. The method analyzes the "open" portions of text blobs to determine the direction in which the open portions face. By determining the respective densities of blobs opening in a pair of opposite directions (e.g., right or left), the method can establish the direction in which the text as a whole is oriented. We first discuss the orientation of Roman text based on the asymmetry in the openness of Roman letters in the horizontal direction. For non-Roman text such as Pashto and Hebrew, we determine a direction that is the most asymmetric, and therefore the most useful for orientation, given a training dataset. This direction is then used for orientation. This work can be used for automated orientation of mail, checks in ATM envelopes, and scanned, copied, or faxed documents.
  • Keywords
    document image processing; natural languages; optical character recognition; text analysis; Roman scripts; non-Roman scripts; text blobs; training dataset; up-down text orientation; Automation; Character recognition; Facsimile; Frequency; Ink; Optical character recognition software; Postal services; Real time systems; Text recognition; Watermarking
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on
  • ISSN
    1520-5263
  • Print_ISBN
    0-7695-2420-6
  • Type
    conf
  • DOI
    10.1109/ICDAR.2005.13
  • Filename
    1575535
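
The idea in the abstract, comparing how many text blobs open in each of two opposite directions, can be illustrated with a minimal sketch. This is not the paper's implementation. The openness measure used here (background pixels enclosed by ink above, below, and on exactly one side), the `#`-string blob encoding, and the names `opening_counts`, `blob_direction`, `rotate180`, and `is_upside_down` are all hypothetical. The `upright_dominant` parameter is likewise an assumption: the abstract states that Roman letters are asymmetric in horizontal openness but does not say which side dominates, so the caller must supply it (for example, from a training dataset).

```python
def opening_counts(blob):
    """Return (right_open, left_open) pixel counts for one blob.

    blob is a list of equal-length strings in which '#' marks ink.
    A background pixel counts as right-open when there is ink above and
    below it in its column and to its left in its row, but none to its
    right. Left-open is the mirror case. This measure is an assumption
    made for illustration, not the paper's definition.
    """
    height, width = len(blob), len(blob[0])
    ink = [[ch == "#" for ch in row] for row in blob]
    right_open = left_open = 0
    for y in range(height):
        for x in range(width):
            if ink[y][x]:
                continue
            above = any(ink[i][x] for i in range(y))
            below = any(ink[i][x] for i in range(y + 1, height))
            if not (above and below):
                continue
            ink_left = any(ink[y][:x])
            ink_right = any(ink[y][x + 1:])
            if ink_left and not ink_right:
                right_open += 1
            elif ink_right and not ink_left:
                left_open += 1
    return right_open, left_open


def blob_direction(blob):
    """Classify a blob as opening 'right', 'left', or None (a tie)."""
    right_open, left_open = opening_counts(blob)
    if right_open > left_open:
        return "right"
    if left_open > right_open:
        return "left"
    return None


def rotate180(blob):
    """Rotate a blob by 180 degrees, as happens on an upside-down page."""
    return [row[::-1] for row in reversed(blob)]


def is_upside_down(blobs, upright_dominant="right"):
    """Vote across blobs; return True, False, or None if undecided.

    upright_dominant is the direction assumed to dominate in upright
    text. It is a caller-supplied assumption, not a value taken from
    the paper.
    """
    directions = [blob_direction(b) for b in blobs]
    n_right = directions.count("right")
    n_left = directions.count("left")
    if n_right == n_left:
        return None
    observed = "right" if n_right > n_left else "left"
    return observed != upright_dominant


# A crude 'c'-like blob that opens to the right.
C_BLOB = [
    ".###",
    "#...",
    "#...",
    ".###",
]
```

A 180-degree rotation swaps left and right, so a page whose observed majority direction is the opposite of `upright_dominant` is reported as upside down. A tie returns `None` instead of forcing a guess.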