• DocumentCode
    2725933
  • Title

    A Skew Resistant Method for Persian Text Segmentation

  • Author

    Shirali-Shahreza, Sajad ; Manzuri-Shalmani, M.T. ; Shirali-Shahreza, M. Hassan

  • Author_Institution
    Dept. of Comput. Eng., Sharif Univ. of Technol., Tehran
  • fYear
    2007
  • fDate
    1-5 April 2007
  • Firstpage
    115
  • Lastpage
    120
  • Abstract
    Using OCR programs is one of the best ways to convert written and printed documents into digital form. The first phase of OCR is segmenting the input image and identifying text and non-text regions. This paper proposes a new method for segmenting printed Persian text, based on the ink spread effect. Because the Persian script is very different from the English script, most methods proposed for English have not produced good results for Persian. The proposed method is designed around the special features of the Persian script. One of its most important characteristics is resistance to skew. Moreover, the proposed approach is directly applicable to Arabic script.
  • Keywords
    document image processing; feature extraction; image segmentation; optical character recognition; text analysis; Arabic scripts; Persian document; Persian scripts; Persian text segmentation; image segmentation; ink spread effect; optical character recognition; page segmentation; printed documents; skew resistant method; text identification; written documents; Character recognition; Computational intelligence; Design methodology; Gray-scale; Image segmentation; Ink; Optical character recognition software; Optical signal processing; Signal processing; Signal processing algorithms; Ink Spread Effect; Optical Character Recognition (OCR); Page Segmentation; Persian Document;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence in Image and Signal Processing, 2007. CIISP 2007. IEEE Symposium on
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    1-4244-0707-9
  • Type

    conf

  • DOI
    10.1109/CIISP.2007.369303
  • Filename
    4221404