DocumentCode :
2888122
Title :
Early stopping for non-parametric regression: An optimal data-dependent stopping rule
Author :
Raskutti, Garvesh ; Wainwright, Martin J. ; Yu, Bin
Author_Institution :
Dept. of Stat., UC Berkeley, Berkeley, CA, USA
fYear :
2011
fDate :
28-30 Sept. 2011
Firstpage :
1318
Lastpage :
1325
Abstract :
The goal of non-parametric regression is to estimate an unknown function f^* based on n i.i.d. observations of the form y_i = f^*(x_i) + w_i, where {w_i}_{i=1}^n are additive noise variables. Simply choosing a function to minimize the least-squares loss (1/2n) Σ_{i=1}^n (y_i − f(x_i))^2 leads to "overfitting", so various estimators are based on different types of regularization. The early stopping strategy is to run an iterative algorithm such as gradient descent for a fixed but finite number of iterations. Early stopping is known to yield estimates with better prediction accuracy than those obtained by running the algorithm for an infinite number of iterations. Although bounds on this prediction error are known for certain function classes and step-size choices, the bias-variance tradeoffs for arbitrary reproducing kernel Hilbert spaces (RKHSs) and arbitrary choices of step-sizes have not been well understood to date. In this paper, we derive upper bounds on both the L^2(P_n) and L^2(P) error for arbitrary RKHSs, and provide an explicit and easily computable data-dependent stopping rule. In particular, it depends only on the running sum of step-sizes and the eigenvalues of the empirical kernel matrix for the RKHS. For Sobolev spaces and finite-rank kernel classes, we show that our stopping rule yields estimates that achieve the statistically optimal rates in a minimax sense.
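The abstract describes gradient descent on the least-squares loss over an RKHS, stopped by a rule that depends only on the running sum of step-sizes and the eigenvalues of the empirical kernel matrix. The following is a minimal Python sketch of that setup, assuming a precomputed kernel matrix K and a noise-level estimate sigma; the particular stopping threshold and constants here are illustrative stand-ins, not necessarily the paper's exact rule.

```python
import numpy as np

def kernel_gradient_descent(K, y, step=0.5, sigma=1.0, max_iter=1000):
    """Functional gradient descent on the least-squares loss in an RKHS.

    With fitted values f = K @ alpha, each update takes a gradient step on
    (1/2n) * ||f - y||^2 in the RKHS geometry. The stopping check is a
    simplified stand-in for a data-dependent rule: it compares a localized
    complexity, computed from the empirical kernel eigenvalues, against a
    threshold driven by the running sum of step-sizes eta (constants
    illustrative).
    """
    n = K.shape[0]
    # Eigenvalues of the normalized empirical kernel matrix K / n.
    eigvals = np.maximum(np.linalg.eigvalsh(K) / n, 0.0)
    alpha = np.zeros(n)
    eta = 0.0  # running sum of step-sizes
    t = 0
    for t in range(1, max_iter + 1):
        # Gradient step: fitted values move by -step * K @ (f - y) / n.
        alpha -= step * (K @ alpha - y) / n
        eta += step
        # Localized empirical kernel complexity at scale 1/sqrt(eta).
        complexity = np.sqrt(np.sum(np.minimum(eigvals, 1.0 / eta)) / n)
        # Hypothetical noise-calibrated check: stop once the complexity
        # exceeds the level 1 / (2 * e * sigma * eta).
        if complexity > 1.0 / (2.0 * np.e * sigma * eta):
            break
    return K @ alpha, t  # fitted values at the stopped iterate, stop time
```

With a kernel matrix built from the data (e.g., a Gaussian kernel) and an estimate of the noise standard deviation, the returned stop time plays the role of the fixed-but-finite iteration count described in the abstract.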
Keywords :
Hilbert spaces; eigenvalues and eigenfunctions; gradient methods; least squares approximations; matrix algebra; regression analysis; Sobolev spaces; additive noise variables; early stopping strategy; empirical kernel matrix eigenvalues; finite-rank kernel classes; function estimation; gradient descent; iterative algorithm; least-squares loss minimization; nonparametric regression; optimal data-dependent stopping rule; reproducing kernel Hilbert spaces; step-sizes; Complexity theory; Eigenvalues and eigenfunctions; Hilbert space; Iterative methods; Kernel; Upper bound;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
Conference_Location :
Monticello, IL
Print_ISBN :
978-1-4577-1817-5
Type :
conf
DOI :
10.1109/Allerton.2011.6120320
Filename :
6120320