Title :
Note on generalization, regularization and architecture selection in nonlinear learning systems
Author_Institution :
Dept. of Comput. Sci., Yale Univ., New Haven, CT, USA
Date :
30 Sep-1 Oct 1991
Abstract :
The author proposes a new estimate of generalization performance for nonlinear learning systems, the generalized prediction error (GPE), which is based on the notion of the effective number of parameters p_eff(λ). GPE requires neither a test set nor computationally intensive cross-validation, and it generalizes previously proposed model selection criteria (such as GCV, FPE, AIC, and PSE) in that it is formulated to include biased, nonlinear models (such as backpropagation networks) which may incorporate weight decay or other regularizers. The effective number of parameters p_eff(λ) depends on the amount of bias and smoothness in the model (as determined by the regularization parameter λ) and generally differs from the number of weights p. Constructing an optimal architecture therefore requires not only finding the weights ŵ*_λ that minimize the training function U(λ, w), but also finding the λ that minimizes GPE(λ).
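The linear (ridge / weight-decay) special case makes the role of p_eff(λ) concrete: there the effective number of parameters reduces to the trace of the smoother matrix X(XᵀX + λI)⁻¹Xᵀ, i.e. Σ d_i²/(d_i² + λ) over the singular values d_i of X, so p_eff shrinks from min(n, p) toward zero as λ grows. The sketch below sweeps λ and picks the value minimizing a GPE-style score; the function names, the noise-variance estimate σ̂² = ‖r‖²/(n − p_eff), and the exact penalty form 2σ̂²·p_eff/n are illustrative assumptions, not the paper's precise formulation for nonlinear networks.

```python
# Minimal sketch of GPE-style lambda selection in the linear (ridge /
# weight-decay) special case. Names and the noise-variance estimate are
# illustrative assumptions, not the paper's exact nonlinear formulation.
import numpy as np

def p_eff(X, lam):
    """Effective number of parameters for ridge regression:
    sum of d_i^2 / (d_i^2 + lam) over the singular values d_i of X."""
    d = np.linalg.svd(X, compute_uv=False)
    return np.sum(d**2 / (d**2 + lam))

def gpe(X, y, lam):
    """GPE-style score: training MSE + 2 * sigma_hat^2 * p_eff / n."""
    n, p = X.shape
    # Ridge (weight-decay) solution of the regularized training problem.
    w = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
    resid = y - X @ w
    pe = p_eff(X, lam)
    mse_train = np.mean(resid**2)
    # Noise-variance estimate using the effective degrees of freedom.
    sigma2_hat = (resid @ resid) / max(n - pe, 1e-9)
    return mse_train + 2.0 * sigma2_hat * pe / n

# Toy data: sweep lambda and keep the minimizer of the GPE-style score.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = X @ rng.normal(size=10) + 0.5 * rng.normal(size=100)
lams = np.logspace(-3, 2, 50)
best = min(lams, key=lambda lam: gpe(X, y, lam))
print(f"lambda minimizing GPE score: {best:.4g}, p_eff = {p_eff(X, best):.2f}")
```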
Keywords :
backpropagation; generalisation (artificial intelligence); learning systems; neural nets; signal processing; architecture selection; back propagation networks; biased nonlinear networks; generalization; generalized prediction error; nonlinear learning systems; regularization; training function minimisation; weight decay; Adaptive signal processing; Computer architecture; Computer errors; Computer networks; Computer science; Internet; Learning systems; Noise generators; Supervised learning; Testing
Conference_Titel :
Neural Networks for Signal Processing [1991], Proceedings of the 1991 IEEE Workshop
Conference_Location :
Princeton, NJ
Print_ISBN :
0-7803-0118-8
DOI :
10.1109/NNSP.1991.239541