• DocumentCode
    1415572
  • Title

    A statistical, nonparametric methodology for document degradation model validation

  • Author

    Kanungo, Tapas ; Haralick, Robert M. ; Baird, Henry S. ; Stuetzle, Werner ; Madigan, David

  • Author_Institution
    Center for Autom. Res., Maryland Univ., College Park, MD, USA
  • Volume
    22
  • Issue
    11
  • fYear
    2000
  • fDate
    11/1/2000 12:00:00 AM
  • Firstpage
    1209
  • Lastpage
    1223
  • Abstract
    Printing, photocopying, and scanning processes degrade the image quality of a document. Statistical models of these degradation processes are crucial for document image understanding research. In this paper, we present a statistical methodology that can be used to validate local degradation models. This method is based on a nonparametric, two-sample permutation test. Another standard statistical device, the power function, is then used to choose between algorithm variables such as distance functions. Since the validation and the power function procedures are independent of the model, they can be used to validate any other degradation model. A method for comparing any two models is also described. It uses p-values associated with the estimated models to select the model that is closer to the real world.
  • Keywords
    document image processing; optical character recognition; parameter estimation; statistical analysis; document degradation model; model validation; nonparametric statistical test; simulation models; statistical models; two-sample permutation test; Algorithm design and analysis; Control system synthesis; Degradation; Electric breakdown; Image quality; Optical character recognition software; Optimal control; Predictive models; System performance; Testing
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type

    jour

  • DOI
    10.1109/34.888707
  • Filename
    888707
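
The abstract describes validation by a nonparametric, two-sample permutation test. The following is a minimal sketch of that general technique, not the authors' implementation. It assumes each document sample has already been reduced to a list of scalar features, and it uses the absolute difference of means as the distance function; the paper works with character images and compares several distance functions. The names `mean_difference` and `permutation_test` are hypothetical.

```python
import random


def mean_difference(xs, ys):
    """Illustrative distance between two samples (an assumption, not the paper's choice)."""
    return abs(sum(xs) / len(xs) - sum(ys) / len(ys))


def permutation_test(real, simulated, statistic=mean_difference,
                     n_permutations=2000, seed=0):
    """Estimate the p-value for the null hypothesis that both samples
    come from the same distribution."""
    rng = random.Random(seed)
    observed = statistic(real, simulated)
    pooled = list(real) + list(simulated)
    n_real = len(real)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the sample labels are exchangeable,
        # so reshuffle the pooled data and split it at the original size.
        rng.shuffle(pooled)
        if statistic(pooled[:n_real], pooled[n_real:]) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (n_permutations + 1)
```

A small p-value suggests that the simulated sample is distinguishable from the real one, so the model would be rejected at the chosen significance level. Following the abstract's description of model comparison, running the same test for two candidate models and preferring the one with the larger p-value is one way to select the model closer to the real data.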