• DocumentCode
    248625
  • Title

    A precise skew estimation algorithm for document images using KNN clustering and Fourier transform

  • Author

    Fabrizio, Jonathan

  • Author_Institution
    LRDE-EPITA, Le Kremlin-Bicêtre, France
  • fYear
    2014
  • fDate
    27-30 Oct. 2014
  • Firstpage
    2585
  • Lastpage
    2588
  • Abstract
    In this article, we propose a simple and precise skew estimation algorithm for binarized document images. The estimation is performed in the frequency domain. To get a precise result, the Fourier transform is not applied to the document itself; instead, the document is first preprocessed: all regions of the document are clustered using KNN, and the contours of the grouped regions are smoothed using the convex hull to form more regular shapes with better orientation. No assumption is made concerning the nature or the content of the document. This method has been shown to be very accurate and was ranked first at the DISEC'13 contest, during the ICDAR competitions.
  • Keywords
    Fourier transforms; document image processing; estimation theory; frequency-domain analysis; pattern clustering; DISEC'13 contest; Fourier transform; ICDAR competition; KNN clustering; binarized document imaging; convex hull; frequency domain estimation algorithm; skew estimation algorithm; Clustering algorithms; Estimation; Robustness; Text analysis; KNN; Skew estimation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing (ICIP), 2014 IEEE International Conference on
  • Conference_Location
    Paris
  • Type

    conf

  • DOI
    10.1109/ICIP.2014.7025523
  • Filename
    7025523
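
The preprocessing described in the abstract (grouping regions by nearest neighbours, then replacing each group by its convex hull) can be illustrated with a minimal sketch. This record does not give the paper's actual procedure, so everything below is an assumption made for illustration: the function names `knn_groups` and `convex_hull` are hypothetical, regions are reduced to single centre points, groups are formed by linking each point to its `k` nearest neighbours, and the hull is computed with Andrew's monotone chain. The Fourier-domain angle estimation itself is not shown.

```python
from math import dist


def knn_groups(centers, k=1):
    """Group points by linking each one to its k nearest neighbours.

    Illustrative assumption only: the paper's clustering rule is not
    specified in this record. Linked points are merged with union-find.
    Returns a list of groups, each a list of indices into `centers`.
    """
    parent = list(range(len(centers)))

    def find(i):
        # Follow parent links to the group representative (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, c in enumerate(centers):
        others = [j for j in range(len(centers)) if j != i]
        nearest = sorted(others, key=lambda j: dist(c, centers[j]))[:k]
        for j in nearest:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(centers)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b turns counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


if __name__ == "__main__":
    # Two well-separated clusters of hypothetical region centres.
    centers = [(0, 0), (1, 0), (2, 0), (100, 100), (101, 100)]
    for group in knn_groups(centers, k=1):
        print(group, convex_hull([centers[i] for i in group]))
```

In a fuller pipeline, the filled hulls would presumably be rendered into a new image whose Fourier spectrum is then analysed for the dominant orientation; that step depends on details not available in this record.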