DocumentCode :
2888122
Title :
Early stopping for non-parametric regression: An optimal data-dependent stopping rule
Author :
Raskutti, Garvesh ; Wainwright, Martin J. ; Yu, Bin
Author_Institution :
Dept. of Stat., UC Berkeley, Berkeley, CA, USA
fYear :
2011
fDate :
28-30 Sept. 2011
Firstpage :
1318
Lastpage :
1325
Abstract :
The goal of non-parametric regression is to estimate an unknown function f^* based on n i.i.d. observations of the form y_i = f^*(x_i) + w_i, where {w_i}_{i=1}^n are additive noise variables. Simply choosing a function to minimize the least-squares loss (1/2n) Σ_{i=1}^n (y_i − f(x_i))^2 leads to "overfitting", so various estimators are based on different types of regularization. The early stopping strategy is to run an iterative algorithm such as gradient descent for a fixed but finite number of iterations. Early stopping is known to yield estimates with better prediction accuracy than those obtained by running the algorithm for an infinite number of iterations. Although bounds on this prediction error are known for certain function classes and step-size choices, the bias-variance tradeoffs for arbitrary reproducing kernel Hilbert spaces (RKHSs) and arbitrary choices of step-sizes have not been well understood to date. In this paper, we derive upper bounds on both the L^2(P_n) and L^2(P) error for arbitrary RKHSs, and provide an explicit and easily computable data-dependent stopping rule. In particular, it depends only on the running sum of step-sizes and the eigenvalues of the empirical kernel matrix for the RKHS. For Sobolev spaces and finite-rank kernel classes, we show that our stopping rule yields estimates that achieve the statistically optimal rates in a minimax sense.
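The abstract describes gradient descent on the least-squares loss over an RKHS, stopped by a rule that depends only on the running sum of step-sizes and the eigenvalues of the empirical kernel matrix. The following is a minimal Python sketch of that setup, assuming a precomputed kernel matrix K and a noise-level estimate sigma; the particular stopping threshold and constants here are illustrative stand-ins, not necessarily the paper's exact rule.

```python
import numpy as np

def kernel_gradient_descent(K, y, step=0.5, sigma=1.0, max_iter=1000):
    """Functional gradient descent on the least-squares loss in an RKHS.

    With fitted values f = K @ alpha, each update takes a gradient step on
    (1/2n) * ||f - y||^2 in the RKHS geometry. The stopping check is a
    simplified stand-in for a data-dependent rule: it compares a localized
    complexity, computed from the empirical kernel eigenvalues, against a
    threshold driven by the running sum of step-sizes eta (constants
    illustrative).
    """
    n = K.shape[0]
    # Eigenvalues of the normalized empirical kernel matrix K / n.
    eigvals = np.maximum(np.linalg.eigvalsh(K) / n, 0.0)
    alpha = np.zeros(n)
    eta = 0.0  # running sum of step-sizes
    t = 0
    for t in range(1, max_iter + 1):
        # Gradient step: fitted values move by -step * K @ (f - y) / n.
        alpha -= step * (K @ alpha - y) / n
        eta += step
        # Localized empirical kernel complexity at scale 1/sqrt(eta).
        complexity = np.sqrt(np.sum(np.minimum(eigvals, 1.0 / eta)) / n)
        # Hypothetical noise-calibrated check: stop once the complexity
        # exceeds the level 1 / (2 * e * sigma * eta).
        if complexity > 1.0 / (2.0 * np.e * sigma * eta):
            break
    return K @ alpha, t  # fitted values at the stopped iterate, stop time
```

With a kernel matrix built from the data (e.g., a Gaussian kernel) and an estimate of the noise standard deviation, the returned stop time plays the role of the fixed-but-finite iteration count described in the abstract.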
Keywords :
Hilbert spaces; eigenvalues and eigenfunctions; gradient methods; least squares approximations; matrix algebra; regression analysis; Sobolev spaces; additive noise variables; early stopping strategy; empirical kernel matrix eigenvalues; finite-rank kernel classes; function estimation; gradient descent; iterative algorithm; least-squares loss minimization; nonparametric regression; optimal data-dependent stopping rule; reproducing kernel Hilbert spaces; step-sizes; Complexity theory; Eigenvalues and eigenfunctions; Hilbert space; Iterative methods; Kernel; Upper bound;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
Conference_Location :
Monticello, IL
Print_ISBN :
978-1-4577-1817-5
Type :
conf
DOI :
10.1109/Allerton.2011.6120320
Filename :
6120320