Title of article :
Data-based interval estimation of classification error rates
Author/Authors :
Krzanowski، W. J. نويسنده ,
Issue Information :
روزنامه با شماره پیاپی سال 2001
Abstract :
Leave-one-out and 632 bootstrap are popular data-based methods of estimating the true error rate of a classification rule, but practical applications almost exclusively quote only point estimates. Interval estimation would provide better assessment of the future performance of the rule, but little has been published on this topic. We first review general-purpose jackknife and bootstrap methodology that can be used in conjunction with leave-one-out estimates to provide prediction intervals for true error rates of classification rules. Monte Carlo simulation is then used to investigate coverage rates of the resulting intervals for normal data, but the results are disappointing; standard intervals show considerable overinclusion, intervals based on Edgeworth approximations or random weighting do not perform well, and while a bootstrap approach provides intervals with coverage rates closer to the nominal ones there is still marked underinclusion. We then turn to intervals constructed from 632 bootstrap estimates, and show that much better results are obtained. Although there is now some overinclusion, particularly for large training samples, the actual coverage rates are sufficiently close to the nominal rates for the method to be recommended. An application to real data illustrates the considerable variability that can arise in practical estimation of error rates.
Keywords :
A. Organic compounds , A. Superconductors , D. Phase transitions , D. Spin-density waves
Journal title :
JOURNAL OF APPLIED STATISTICS
Journal title :
JOURNAL OF APPLIED STATISTICS