• DocumentCode
    3205837
  • Title

    A novel method for de-warping in Persian document images captured by cameras

  • Author

    Dehbovid, Hadi ; Razzazi, Farbod ; Alirezaii, Shapour

  • Author_Institution
    Electr. Eng. Dept., Islamic Azad Univ., Tehran, Iran
  • fYear
    2010
  • fDate
    8-10 Oct. 2010
  • Firstpage
    614
  • Lastpage
    619
  • Abstract
    In this paper, we propose a novel algorithm for de-warping Persian document images captured by cameras. The aim of de-warping is to remove page distortions and straighten camera-captured document images so that the documents are readable by an OCR system. Recently, the industrial use of images captured by digital cameras has expanded significantly. Most studies carried out so far in this area have focused on documents written in Latin script, and little research has been conducted on Persian documents. The proposed algorithm is based on segmenting the text into its components. It offers an effective technique for detecting the upper and lower baselines, which is used to estimate the slope of the words. Moreover, the warped words are shifted vertically, relative to the horizontal line, by fitting a quadratic curve to the centers of the words in a line. The suggested algorithm is evaluated with qualitative and quantitative measures, and the results of its implementation on various documents indicate a 92% accuracy of the proposed technique in correcting the location and angle of the words.
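    The vertical-shift step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `vertical_shifts`, the use of NumPy's least-squares `polyfit`, and the choice of the mean fitted height as the horizontal target line are all assumptions made for illustration only.

```python
import numpy as np


def vertical_shifts(centers_x, centers_y):
    """Fit y = a*x^2 + b*x + c to the word centers of one text line and
    return (coeffs, shifts), where shifts[i] is the vertical offset that
    moves word i from the fitted curve onto a horizontal reference line.

    Hypothetical helper: the reference line is assumed to sit at the mean
    of the fitted curve heights; the paper may define it differently.
    """
    x = np.asarray(centers_x, dtype=float)
    y = np.asarray(centers_y, dtype=float)
    if x.size < 3:
        raise ValueError("need at least three word centers for a quadratic fit")

    # Least-squares quadratic fit; coeffs are ordered [a, b, c].
    coeffs = np.polyfit(x, y, deg=2)
    fitted = np.polyval(coeffs, x)

    # Assumed horizontal target line: mean height of the fitted curve.
    reference_y = fitted.mean()
    shifts = reference_y - fitted
    return coeffs, shifts


# Synthetic word centers lying on a curved text line (illustrative data only).
xs = [10, 60, 110, 160, 210]
ys = [0.001 * (x - 110) ** 2 + 50 for x in xs]

coeffs, shifts = vertical_shifts(xs, ys)
corrected = np.asarray(ys) + shifts  # word centers after the vertical shift
```

    In this sketch, applying the returned shifts to centers that lie exactly on a quadratic curve places them all at the same height; real word centers would keep their residual scatter around the fitted curve.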
  • Keywords
    cameras; curve fitting; distortion; document image processing; image denoising; image segmentation; optical character recognition; text analysis; OCR system; Persian document image dewarping; digital cameras; page distortion removal; quadratic curve fitting; slope estimation; text segmentation; Algorithm design and analysis; Digital cameras; Image restoration; Image segmentation; Mathematical model; Noise measurement; Geometric Distortion; Image Archives; OCR; camera based OCR; component;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Information Systems and Industrial Management Applications (CISIM), 2010 International Conference on
  • Conference_Location
    Krakow
  • Print_ISBN
    978-1-4244-7817-0
  • Type

    conf

  • DOI
    10.1109/CISIM.2010.5643524
  • Filename
    5643524