• DocumentCode
    1992849
  • Title
    The Problem of Cross-Validation: Averaging and Bias, Repetition and Significance
  • Author
    Powers, D.M.W.; Atyabi, A.
  • Author_Institution
    Beijing Municipal Lab. for Multimedia & Intell. Software, Beijing Univ. of Technol., Beijing, China
  • fYear
    2012
  • fDate
    27-30 May 2012
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Cross-Validation (CV) is the primary mechanism used in Machine Learning to control generalization error in the absence of sufficiently large quantities of marked up (tagged or labelled) data to undertake independent testing, training and validation (including early stopping, feature selection, parameter tuning, boosting and/or fusion). Repeated Cross-Validation (RCV) is used to try to further improve the accuracy of our performance estimates, including compensating for outliers. Typically a Machine Learning researcher will then compare a new target algorithm against a wide range of competing algorithms on a wide range of standard datasets. The combination of many training folds, many CV repetitions, many algorithms and parameterizations, and many training sets, adds up to a very large number of data points to compare, and a massive multiple testing problem quadratic in the number of individual test combinations. Research in Machine Learning sometimes involves basic significance testing, or provides confidence intervals, but seldom addresses the multiple testing problem, whereby the assumption of p < .05 significance means that we expect a spurious "significant" result for 1 in 20 of our many test pairs. This paper defines and explores a protocol that reduces the scale of repeated CV whilst providing a principled way to control the erosion of significance due to multiple testing.
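    The scale of the multiple testing problem described in the abstract can be sketched numerically. This is an illustrative back-of-the-envelope calculation, not the paper's protocol: the counts of algorithms and datasets are hypothetical, and the Bonferroni correction shown is one standard (if conservative) way to control family-wise error under many pairwise comparisons.

    ```python
    # Illustrative sketch (not the paper's protocol): pairwise algorithm
    # comparisons grow quadratically, and an uncorrected p < .05 threshold
    # implies roughly 1 spurious "significant" result per 20 tests.
    from math import comb

    n_algorithms = 10   # hypothetical: target algorithm plus competitors
    n_datasets = 20     # hypothetical: standard benchmark datasets

    # Every pair of algorithms is compared on every dataset,
    # so the number of tests is quadratic in the number of algorithms.
    pairs_per_dataset = comb(n_algorithms, 2)   # 45 pairs
    total_tests = pairs_per_dataset * n_datasets  # 900 tests

    alpha = 0.05
    expected_spurious = total_tests * alpha     # ~45 false positives expected

    # A simple Bonferroni correction tightens the per-test threshold
    # so the family-wise error rate stays at alpha overall.
    bonferroni_alpha = alpha / total_tests

    print(f"total pairwise tests:        {total_tests}")
    print(f"expected spurious at p<.05:  {expected_spurious:.0f}")
    print(f"Bonferroni per-test alpha:   {bonferroni_alpha:.2e}")
    ```

    With 10 algorithms and 20 datasets the uncorrected regime expects about 45 spurious significances, which is why the paper argues for a principled protocol rather than raw repeated CV.
    
    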
  • Keywords
    competitive algorithms; feature extraction; generalisation (artificial intelligence); learning (artificial intelligence); performance evaluation; RCV; basic significance testing; competing algorithms; early stopping; feature selection; generalization error control; independent testing; independent training; independent validation; machine learning; marked up data; multiple testing problem; outlier compensation; parameter tuning; performance estimates; repeated cross-validation; Accuracy; Correlation; Learning systems; Machine learning; Machine learning algorithms; Testing; Training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2012 Spring Congress on Engineering and Technology (S-CET)
  • Conference_Location
    Xi'an, China
  • Print_ISBN
    978-1-4577-1965-3
  • Type
    conf
  • DOI
    10.1109/SCET.2012.6342143
  • Filename
    6342143