DocumentCode :
3229212
Title :
Consistent inference of probabilities in layered networks: predictions and generalization
Author :
Tishby, Naftali ; Levin, Esther ; Solla, Sara A.
Author_Institution :
AT&T Bell Lab., Murray Hill, NJ, USA
fYear :
1989
fDate :
0-0 1989
Firstpage :
403
Abstract :
The problem of learning a general input-output relation using a layered neural network is discussed in a statistical framework. Imposing the consistency condition that error minimization be equivalent to likelihood maximization for training the network leads to a Gibbs distribution on a canonical ensemble of networks sharing the same architecture. This statistical description makes it possible to evaluate the probability of correctly predicting an independent example after the network has been trained on a given training set. The prediction probability is highly correlated with the generalization ability of the network as measured outside the training set, suggesting a general and practical criterion for training layered networks by minimizing prediction errors. The authors demonstrate the utility of this criterion for selecting the optimal architecture in the contiguity problem. As a theoretical application of the statistical formalism, they discuss learning curves and estimate, in a simple example, the training-set size sufficient for correct generalization.
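A minimal sketch of the canonical-ensemble picture summarized in the abstract, with notation assumed here rather than taken from the record (weights w, training-set error E_D(w), inverse temperature beta, partition function Z):

% Sketch only: notation is assumed, not the paper's exact equations.
% Post-training Gibbs distribution over networks of a fixed architecture:
\[
  P(w \mid D) \;=\; \frac{e^{-\beta E_D(w)}}{Z(D)},
  \qquad
  Z(D) \;=\; \int \! dw \; e^{-\beta E_D(w)} .
\]
% Ensemble-averaged probability of predicting an independent example (x, y),
% the quantity the abstract relates to generalization ability:
\[
  P(y \mid x, D) \;=\; \int \! dw \; P(y \mid x, w)\, P(w \mid D) .
\]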
Keywords :
learning systems; neural nets; probability; Gibbs distribution; consistency condition; error minimization; general input-output relation; layered networks; learning; learning curves; neural network; optimal architecture; probabilities; statistical formalism; statistical framework; training set; training size; Learning systems; Neural networks; Probability;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Neural Networks, 1989. IJCNN., International Joint Conference on
Conference_Location :
Washington, DC, USA
Type :
conf
DOI :
10.1109/IJCNN.1989.118274
Filename :
118274