DocumentCode
3334492
Title
Note on generalization, regularization and architecture selection in nonlinear learning systems
Author
Moody, John E.
Author_Institution
Dept. of Comput. Sci., Yale Univ., New Haven, CT, USA
fYear
1991
fDate
30 Sep-1 Oct 1991
Firstpage
1
Lastpage
10
Abstract
The author proposes a new estimate of generalization performance for nonlinear learning systems, called the generalized prediction error (GPE), which is based upon the notion of the effective number of parameters p_eff(λ). GPE does not require the use of a test set or computationally intensive cross-validation, and it generalizes previously proposed model selection criteria (such as GCV, FPE, AIC, and PSE) in that it is formulated to include biased, nonlinear models (such as backpropagation networks) which may incorporate weight decay or other regularizers. The effective number of parameters p_eff(λ) depends upon the amount of bias and smoothness (as determined by the regularization parameter λ) in the model, but generally differs from the number of weights p. Constructing an optimal architecture thus requires not just finding the weights ŵ*_λ that minimize the training function U(λ, w), but also the λ that minimizes GPE(λ).
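For concreteness, here is a minimal sketch of the idea in the special case of a linear model with weight decay (ridge regression), where the effective number of parameters reduces to the trace of the smoother (hat) matrix, p_eff(λ) = tr(S_λ), and the criterion takes the PSE-like form MSE_train + 2·σ̂²·p_eff(λ)/n. The function name `gpe_ridge`, the grid search, and the residual-based noise estimate are illustrative assumptions, not the paper's general nonlinear formulation.

```python
import numpy as np

def gpe_ridge(X, y, lam):
    """GPE-style criterion for ridge regression (linear model + weight decay).

    In this special case the smoother matrix is
        S_lam = X (X^T X + lam I)^{-1} X^T,
    the effective number of parameters is p_eff(lam) = tr(S_lam),
    and the criterion reduces to  MSE_train + 2 * sigma2_hat * p_eff / n.
    """
    n, p = X.shape
    # Ridge solution: w_hat(lam) = argmin_w ||y - X w||^2 + lam ||w||^2
    A = X.T @ X + lam * np.eye(p)
    w_hat = np.linalg.solve(A, X.T @ y)
    resid = y - X @ w_hat
    mse_train = float(resid @ resid) / n
    # Effective number of parameters: trace of the hat (smoother) matrix.
    p_eff = float(np.trace(X @ np.linalg.solve(A, X.T)))
    # Noise-variance estimate from the training residuals (an assumption here).
    sigma2_hat = float(resid @ resid) / max(n - p_eff, 1e-8)
    return mse_train + 2.0 * sigma2_hat * p_eff / n, p_eff

# Architecture selection: pick the lambda that minimizes the criterion
# over a grid, rather than the one that minimizes training error alone.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
y = X @ rng.normal(size=10) + 0.5 * rng.normal(size=50)
lams = np.logspace(-3, 2, 30)
scores = [gpe_ridge(X, y, lam)[0] for lam in lams]
print(f"lambda* = {lams[int(np.argmin(scores))]:.4g}")
```

Note how increasing λ shrinks p_eff(λ) below the raw weight count p, which is the mechanism the abstract describes: the selected λ trades training error against effective model complexity.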
Keywords
backpropagation; generalisation (artificial intelligence); learning systems; neural nets; signal processing; architecture selection; back propagation networks; biased nonlinear networks; generalization; generalized prediction error; nonlinear learning systems; regularization; training function minimisation; weight decay; Adaptive signal processing; Computer architecture; Computer errors; Computer networks; Computer science; Internet; Learning systems; Noise generators; Supervised learning; Testing
fLanguage
English
Publisher
ieee
Conference_Titel
Neural Networks for Signal Processing: Proceedings of the 1991 IEEE Workshop
Conference_Location
Princeton, NJ
Print_ISBN
0-7803-0118-8
Type
conf
DOI
10.1109/NNSP.1991.239541
Filename
239541