DocumentCode :
1551425
Title :
Model complexity control for regression using VC generalization bounds
Author :
Cherkassky, Vladimir ; Shao, Xuhui ; Mulier, Filip M. ; Vapnik, Vladimir N.
Author_Institution :
Dept. of Electr. & Comput. Eng., Minnesota Univ., Minneapolis, MN, USA
Volume :
10
Issue :
5
fYear :
1999
fDate :
9/1/1999
Firstpage :
1075
Lastpage :
1089
Abstract :
It is well known that for a given sample size there exists a model of optimal complexity corresponding to the smallest prediction (generalization) error. Hence, any method for learning from finite samples needs some provision for complexity control. Existing implementations of complexity control include penalization (or regularization), weight decay (in neural networks), and various greedy procedures (aka constructive, growing, or pruning methods). There are numerous proposals for determining optimal model complexity (aka model selection) based on various (asymptotic) analytic estimates of the prediction risk and on resampling approaches. Nonasymptotic bounds on the prediction risk based on Vapnik-Chervonenkis (VC) theory have been proposed by Vapnik. This paper describes the application of VC-bounds to regression problems with the usual squared loss. An empirical study is performed for settings where the VC-bounds can be rigorously applied, i.e., linear models and penalized linear models, where the VC-dimension can be accurately estimated and the empirical risk can be reliably minimized. Empirical comparisons between model selection using VC-bounds and classical methods are performed for various noise levels, sample sizes, target functions, and types of approximating functions. Our results demonstrate the advantages of VC-based complexity control with finite samples.
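The selection rule described in the abstract can be sketched compactly. Below is a minimal Python sketch of VC-based complexity control for polynomial regression, assuming the practical form of Vapnik's penalized risk estimate for squared loss, R_est = R_emp * (1 - sqrt(p - p ln p + ln(n)/(2n)))^(-1) with p = h/n, where h is the VC-dimension and n the sample size. The polynomial family, sample size, and noise level are illustrative assumptions, not the paper's exact experimental settings.

```python
import numpy as np

def vc_penalization_factor(h, n):
    """Practical VC penalization factor for regression with squared loss:
    r(p, n) = (1 - sqrt(p - p*ln(p) + ln(n)/(2n)))^(-1), with p = h/n.
    Returns inf when the bound degenerates (denominator <= 0)."""
    p = h / n
    radicand = p - p * np.log(p) + np.log(n) / (2.0 * n)
    denom = 1.0 - np.sqrt(radicand)
    return np.inf if denom <= 0.0 else 1.0 / denom

def select_degree_by_vc_bound(x, y, max_degree=10):
    """Choose the polynomial degree minimizing the VC risk estimate
    R_emp * r(h, n); for polynomials the VC-dimension is h = degree + 1."""
    n = len(x)
    best_bound, best_degree = np.inf, 0
    for d in range(max_degree + 1):
        coeffs = np.polyfit(x, y, d)                        # least-squares fit
        r_emp = np.mean((y - np.polyval(coeffs, x)) ** 2)   # empirical risk
        bound = r_emp * vc_penalization_factor(d + 1, n)
        if bound < best_bound:
            best_bound, best_degree = bound, d
    return best_degree

# Illustrative run on a noisy sine target (an assumed setup, not the paper's data).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * x) + 0.2 * rng.standard_normal(50)
print("selected degree:", select_degree_by_vc_bound(x, y))
```

For linear estimators such as polynomials, the VC-dimension equals the number of free parameters, which is what makes the bound rigorously applicable in this setting, as the abstract notes.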
Keywords :
generalisation (artificial intelligence); learning by example; minimisation; statistical analysis; VC generalization bounds; Vapnik-Chervonenkis theory; approximating functions; dimension estimation; empirical risk minimization; finite-samples-based learning; generalization error; model complexity control; noise levels; nonasymptotic bounds; optimal complexity model; penalized linear models; regression; sample size; smallest prediction error; target functions; Control systems; Learning systems; Machine learning; Neural networks; Noise level; Predictive models; Proposals; Risk analysis; Training data
fLanguage :
English
Journal_Title :
IEEE Transactions on Neural Networks
Publisher :
IEEE
ISSN :
1045-9227
Type :
jour
DOI :
10.1109/72.788648
Filename :
788648